Abstract

Inefficiencies  in  the  bureaucratic  organization  of  the  state  are  often  viewed  as  important 
factors  in  retarding  economic  development.  Why  certain  societies  choose  or  end  up  with  such 
inefficient  organizations  has  received  very  little  attention,  however.  In  this  paper,  we  present  a 
simple  theory  of  the  emergence  and  persistence  of  inefficient  states.  The  society  consists  of  rich 
and  poor  individuals.  The  rich  are  initially  in  power,  but  expect  to  transition  to  democracy, 
which  will  choose  redistributive  policies.  Taxation  requires  the  employment  of  bureaucrats. 
We  show  that,  under  certain  circumstances,  by  choosing  an  inefficient  state  structure,  the 
rich  may  be  able  to  use  patronage  and  capture  democratic  politics.  This  enables  them  to 
reduce  the  amount  of  redistribution  and  public  good  provision  in  democracy.  Moreover,  the 
inefficient  state  creates  its  own  constituency  and  tends  to  persist  over  time.  Intuitively,  an 
inefficient  state  structure  creates  more  rents  for  bureaucrats  than  would  an  efficient  state 
structure.  When  the  poor  come  to  power  in  democracy,  they  will  reform  the  structure  of  the 
state  to  make  it  more  efficient  so  that  higher  taxes  can  be  collected  at  lower  cost  and  with 
lower  rents  for  bureaucrats.  Anticipating  this,  when  the  society  starts  out  with  an  inefficient 
organization  of  the  state,  bureaucrats  support  the  rich,  who  set  lower  taxes  but  also  provide 
rents  to  bureaucrats.  We  show  that  in  order  to  generate  enough  political  support,  the  coalition 
of  the  rich  and  the  bureaucrats  may  not  only  choose  an  inefficient  organization  of  the  state, 
but  they  may  further  expand  the  size  of  bureaucracy  so  as  to  gain  additional  votes.  The  model 
shows  that  an  equilibrium  with  an  inefficient  state  is  more  likely  to  arise  when  there  is  greater 
inequality  between  the  rich  and  the  poor,  when  bureaucratic  rents  take  intermediate  values 
and  when  individuals  are  sufficiently  forward-looking. 

Keywords:  bureaucracy,  corruption,  democracy,  patronage  politics,  political  economy, 
public  goods,  redistributive  politics. 
1      Introduction

There  are  large  cross-country  differences  in  the  extent  of  bureaucratic  corruption  and  the 
efficiency  of  the  state  organization  (e.g.,  World  Bank,  2004).  An  influential  argument,  dating 
back  at  least  to  Tilly  (1990),  maintains  that  differences  in  "state  capacity"  are  an  important 
determinant  of  economic  development.1  The  evidence  that  many  less-developed  economies  in 
sub-Saharan  Africa,  Asia  and  Latin  America  only  have  a  small  fraction  of  their  GDP  raised 
in  tax  revenue  and  invested  by  the  government  (e.g.,  Acemoglu,  2005)  and  the  correlation 
between  measures  of  state  capacity  and  economic  growth  (e.g.,  Rauch  and  Evans,  2000)  are 
also  consistent  with  this  view.  Societies  with  limited  state  capacity  are  also  those  that  invest 
relatively  little  in  public  goods  and  do  not  adopt  policies  that  redistribute  resources  to  the 
poor.2 Brazil provides a typical example of a society where the state sector has been relatively
inefficient and democratic politics has generated only limited public goods and benefits for the
poor (e.g., Gay, 1990, Evans, 1992, Weyland, 1996, Roett, 1999).

In  this  paper,  we  construct  a  political  economy  model,  which  links  the  emergence  and 
persistence  of  inefficient  states  to  the  strategic  use  of  patronage  politics  by  the  elite  as  a  means 
of  capturing  democratic  politics.  Democratic  capture  enables  the  elite  to  limit  the  provision 
of  public  goods  and  redistribution,  but  at  the  cost  of  aggregate  inefficiencies.  Our  approach 
therefore  provides  a  unified  answer  both  to  the  question  of  why  inefficient  states  emerge  in 
some  societies  and  why  many  democracies  pursue  relatively  pro-elite  policies.  It  also  suggests 
why  certain  democracies  may  exhibit  relatively  poor  economic  performance  and  adopt  various 
inefficient  policies.3 

Our  model  economy  consists  of  two  groups,  the  rich  elite  and  poor  citizens.  Linear  taxes  can 
be  imposed  on  both  groups,  with  the  proceeds  used  to  finance  public  good  investments.  The 
rich  are  generally  opposed  to  high  levels  of  taxes  and  public  good  investments.  Tax  collection 
requires  that  the  state  employs  bureaucrats  to  prevent  individuals  from  evading  taxes,  but 
bureaucrats  themselves  also  need  to  be  given  incentives  so  that  they  exert  effort  (or  do  not 
accept  bribes).  The  efficiency  with  which  a  central  authority  can  monitor  the  bureaucrats  is  our 
measure  of  the  organization  of  the  state.  Political  competition  is  modeled  either  by  assuming 
the  existence  of  two  parties,  respectively  aligned  with  the  rich  and  the  poor,  or  by  allowing  free 


1 See, for example, Evans (1989, 1995), Levi (1989), Migdal (1988), Epstein (2000), Herbst (2000), Centeno
(2002)  and  Kohli  (2004). 

2 See, for example, Etzioni-Halevy (1983) on the importance of state capacity and bureaucratization for the
development  of  the  welfare  state  in  the  West,  and  Rothstein  and  Uslaner  (2005)  on  the  importance  of  state 
capacity  for  income  redistribution. 

3 On the comparative post-war growth performance of democracies, see, for example, Barro (1999).


entry  into  the  political  arena  by  citizen  candidates  (Osborne  and  Slivinski,  1996,  Besley  and 
Coate,  1997).  In  both  cases,  there  is  no  commitment  to  policies  before  elections  and  the  party 
that  comes  to  power  chooses  the  policy  vector,  including  taxes,  public  good  provision,  and 
bureaucratic  wages,  and  whether  to  reform  the  efficiency  of  the  state  institutions.  Democratic 
political  competition  is  made  interesting  by  the  fact  that  bureaucrats  may  support  either  the 
rich  or  poor  parties  (candidates)  and  their  support  may  be  pivotal  in  the  outcome  of  elections. 

We  consider  two  possible  organizations  of  the  state:  the  first  is  an  "efficient"  organization, 
in  which  bureaucrats  will  be  detected  easily  if  they  fail  to  exert  effort,  while  the  second  is  an 
"inefficient"  one  in  which  monitoring  bureaucrats  is  difficult.  In  equilibrium,  when  the  state  is 
inefficient,  bureaucrats  need  to  be  paid  rents  in  order  to  induce  them  to  perform  their  roles  of 
tax  collection  and  inspection.  The  presence  of  rents  creates  the  possibility  of  patronage  politics, 
whereby  bureaucrats  may  support  the  party  that  will  maintain  the  inefficient  structure.4 

In  a  society  that  is  always  dominated  by  the  rich  elite  or  that  is  permanently  in  democracy 
(with  a  poor  citizen  as  the  median  voter),  the  political  process  produces  an  efficient  organization 
of  bureaucracy,  since  an  inefficient  state  creates  additional  costs  and  no  benefits  for  those 
holding  power.  Our  main  result  is  that  when  the  society  starts  out  as  nondemocratic  (under 
the  control  of  the  rich  elite)  and  is  expected  to  transition  to  democracy,  the  rich  may  find  it 
beneficial  to  choose  an  inefficient  organization  of  the  state  so  as  to  exploit  patronage  politics  to 
limit  redistribution.  In  particular,  bureaucrats  realize  that  once  the  poor  median  voter  comes 
to  power  in  democracy,  there  will  be  bureaucratic  reform,  reducing  their  rents  from  then  on. 
Therefore,  if  the  rich  elite,  when  in  power,  choose  an  inefficient  organization  of  the  state, 
the  current  bureaucrats — who  are  receiving  rents — prefer  to  support  the  rich  rather  than  vote 
with  the  poor.  Consequently,  an  inefficient  state  organization  emerges  as  a  political  instrument 
for  the  rich  elite  to  capture  the  democratic  decision-making  process  by  fostering  a  coalition 
between  themselves  and  the  bureaucrats.  It  is  also  noteworthy  that  the  inefficient  state  not 
only  emerges  in  equilibrium,  but  also  persists;  when  the  state  is  inefficient,  the  bureaucrats 
vote  for  the  party  of  the  rich,  which  chooses  not  to  reform  the  bureaucracy  and  continues  to 
maintain  the  support  of  the  existing  bureaucrats  and  thus  its  political  power. 

Our  analysis  shows  that  patronage  politics  typically  leads  not  only  to  the  emergence  and 
persistence  of  an  inefficient  state  apparatus,  but  also  to  the  overemployment  of  bureaucrats. 


4 In  our  basic  model,  the  assumption  that  the  main  role  of  bureaucrats  is  tax  inspection  is  not  essential.  The 
important  feature  is  that  an  inefficient  state  organization  must  pay  bureaucrats  rents  in  order  to  provide  them 
with  the  right  incentives.  Bureaucrats'  role  as  tax  inspectors  becomes  important  in  the  extension  in  subsection 
5.3,  where  they  can  be  bribed  by  producers  evading  taxes.  We  simplify  the  presentation  by  assuming  that 
bureaucrats'  main  role  is  tax  inspection  throughout. 


This  is  because  the  rich  may  prefer  to  hire  additional  (unnecessary)  bureaucrats  so  as  to  boost 
their  party's  votes.  Consequently,  a  captured  democracy  will  typically  feature  an  inefficient 
state  (bureaucracy),  provide  relatively  few  public  goods  and  employ  an  excessive  number  of 
bureaucrats.  This  pattern  of  bureaucratic  inefficiency  is  consistent  with  the  stylized  view  of 
corrupt  and  low-capacity  bureaucracies  in  many  less  developed  countries  (e.g.,  Geddes,  1991, 
and Rauch and Evans, 2000).5

We  also  show  that  the  equilibrium  with  an  inefficient  state  is  more  likely  when  there  is 
greater inequality. This is because greater inequality raises the equilibrium tax rate in democracy
and makes it more appealing for the rich to create an inefficient state apparatus to prevent
democratic outcomes. An inefficient state also requires intermediate levels of rents/"efficiency
wages"  for  bureaucrats;  when  rents  are  limited,  bureaucrats  would  not  support  the  rich,  while 
too  high  rents  would  make  the  inefficient  state  equilibrium  prohibitively  costly  for  the  rich 
elite.  Finally,  an  inefficient  state  is  more  likely  to  arise  when  agents  are  more  forward-looking, 
because  bureaucrats  support  the  inefficient  state  in  order  to  obtain  future  rents. 

The  rest  of  the  paper  is  organized  as  follows.  Section  2  provides  a  brief  discussion  of  a 
number of case studies that illustrate how patronage politics has been used to limit redistribution
towards the poor and also discusses the related literature. Section 3 outlines the
basic  economic  and  political  environment.  Section  4  characterizes  the  equilibria  of  the  baseline 
model  both  under  permanent  nondemocratic  and  democratic  regimes  as  benchmarks,  and  more 
importantly, under a regime that starts out as nondemocratic and becomes democratic thereafter.
We show that in this last political environment the rich elite may choose an inefficient
state  organization  and  a  sufficiently  large  bureaucracy  in  order  to  create  a  majority  coalition. 
Section  5  generalizes  this  framework  in  a  number  of  directions;  in  particular,  it  allows  for  more 
general contracts with bureaucrats, considers a citizen-candidate setup for political competition,
and allows producers to bribe bureaucrats to evade taxes. Section 6 briefly investigates a
distinctive  implication  of  our  approach  about  the  relationship  between  relative  wages  of  public 
sector employees and the amount of public good provision in democracies. We report cross-country
correlations consistent with this implication. Section 7 concludes, while the Appendix
contains  some  of  the  proofs  omitted  from  the  text. 


5  Even  with  the  overemployment  of  bureaucrats,  bureaucrats  and  the  rich  elite  may  not  have  an  absolute 
majority  in  the  electorate.  In  practice,  the  elites  may  be  able  to  control  the  political  system  using  other 
methods  such  as  lobbying  in  addition  to  the  support  of  the  bureaucrats.  Here,  we  isolate  our  main  mechanism 
by  focusing  on  a  baseline  model  where  the  rich  are  able  to  capture  democracy  without  any  lobbying  or  other 
non-electoral  activities. 


2     Motivation  and  Related  Literature 

In  this  section,  we  briefly  discuss  a  number  of  case  studies  that  motivate  our  analysis  and  also 
relate  our  paper  to  the  existing  literature  in  political  economy. 

2.1      Patronage  Politics,  Inefficient  States  and  Elite  Control 

The  experiences  of  many  societies  in  Latin  America,  Asia  and  Africa  illustrate  the  link  between 
patronage  politics,  inefficient  states  and  elite  control.  Here  we  briefly  mention  three  cases. 

Perhaps  the  most  transparent  example  of  inefficient  and  oversized  bureaucracy  comes  from 
Brazil. Several authors (e.g., Gay, 1990, Weyland, 1996, Roett, 1999) have argued that the
distribution of large numbers of public jobs, both in the public administration and in parastatal
organizations, has created a pattern of patronage politics in Brazil.6 The control over
these  jobs  has  enabled  traditional  elites  to  preserve  their  political  power  and  limit  the  amount 
of  public  good  provision  and  redistribution.  In  fact,  despite  the  high  level  of  inequality  in 
Brazil, elites have been able to control politics for much of the 20th century with only a limited
amount  of  repression  and  relatively  short  periods  of  military  rule.  Interestingly,  the  amount 
of  redistribution  and  public  good  provision  does  not  show  marked  differences  between  military 
and  democratic  periods. 

Patronage politics has often ensured that even those in the poorest neighborhoods of Rio have
supported  the  traditional  parties  rather  than  socialist  or  social  democratic  parties  running 
on  platforms  of  greater  public  good  provision  and  redistribution  (Gay,  1990).  Students  of 
Brazilian  politics  have  noted  the  role  of  public  sector  employees  in  this  process.  For  example, 
Roett (1999, p. 91) writes "state company employees emerged as being among the strongest
supporter of the patrimonial order". In return, successive governments have withstood external
pressures  from  the  IMF  and  have  not  reformed  the  public  sector,  despite  the  "public  perception 
that  public-sector  workers  were  overpaid  and  underworked"  (Roett,  p.  97).  The  process  of 
reforming  the  public  sector  has  started  only  recently  and  progressed  slowly. 

Another example of effective patronage politics is provided by the policies of the Parti Socialiste
(PS) of President Leopold Sedar Senghor in Senegal. After independence, PS faced
increasing challenges from various opposition groups, including urban workers and
farmers. Nevertheless, it managed to preserve its power, with a relatively limited amount of


6 In the early 1980s, about 4 million people had a job in some branch of the Brazilian public sector. Evans
(1992)  observes  that  the  Brazilian  state  is  commonly  recognized  as  a  huge  cabide  de  emprego  (source  of  jobs)  and 
remarks  that  in  contrast  to  the  Weberian  conception,  recruitment  of  public  employees  in  Brazil  is  not  related 
to  merit  but  to  political  connections. 


repression,  largely  owing  to  its  use  of  patronage  politics.  In  fact,  Senghor  promoted  some 
amount  of  political  liberalization  and  allowed  the  creation  of  a  multi-party  system.  However, 
PS  also  exploited  its  incumbency  advantage  to  manipulate  the  democratic  process  by  creating 
an extensive patronage network centered on the state apparatus and parastatal sector. Interestingly,
it was precisely during this period of democratization that the size of the public sector
grew substantially (Boone, 1990, Beck, 1997). Boone (1990, p. 347) describes this process as:
"The  strategic  allocation  of  government  jobs  coopted  restive  intellectuals  and  professionals  and 
incorporated  them  into  political  factions  anchored  in  the  state  bureaucracy."  Thanks  to  this 
successful  implementation  of  patronage  politics,  PS  retained  much  of  its  power  following  the 
transition  to  democracy. 

A  final  example  of  the  rise  of  patronage  politics  in  the  face  of  political  competition  comes 
from Italy. The evolution of the Italian bureaucracy in the post-WWII decades demonstrates
that the mechanism that our model identifies may operate even in relatively developed countries.
A significant expansion of the Italian bureaucracy was initiated by the Italian Christian
Democratic  Party  (DC)  in  the  1950s  after  the  electoral  challenge  from  the  Communist  Party 
increased  sharply,  especially  following  the  1953  political  elections.  Until  the  defeat  in  World 
War  II,  Italian  politics  was  dominated  by  Mussolini's  dictatorship.  After  the  war,  DC  emerged 
as the dominant party. In the 1950s, faced with an electoral challenge from the left, DC created a
highly  disorganized  and  oversized  bureaucracy,  which  subsequently  became  a  natural  source  of 
political  support  and  patronage  for  the  party.  Golden  (2003  p.  199)  describes  the  motive  for 
the  expansion  of  the  bureaucracy  as: 

"The  massive  system  of  political  patronage  that  the  leaders  of  the  DC  constructed 
after  1953  was  their  aggregate  answer  ...  to  enlarging  the  party's  aggregate  vote 
share  while  protecting  the  incumbency  advantage  of  individual  legislators." 

In part with the support of the bureaucracy that it created, DC's dominance of Italian politics
continued  until  the  1980s  and  prevented  the  formation  of  a  left-wing  government. 

2.2     Related  Literature 

Our  paper  is  related  to  a  number  of  different  literatures.  The  first  is  the  political  science  and 
sociology  literature  on  the  organization  of  the  state  and  the  bureaucracy  mentioned  above. 
In contrast, there is relatively little work on the internal organization of the state and bureaucracy
in economics. Some exceptions include Acemoglu and Verdier (1998), Dixit (2002),
Egorov  and  Sonin  (2005)  and  Debs  (2006).  None  of  these  papers  investigate  the  relationship 


between  patronage  politics  and  the  emergence  of  the  inefficient  state  as  a  method  of  limiting 
redistribution. 

Our  paper  is  also  related  to  a  number  of  other  strands  of  the  literature  in  political  economy. 
First,  the  reason  why  the  elite  both  initially  and  later  on  choose  inefficient  institutions  is 
to  control  political  power  in  a  democratic  regime.  As  such,  our  paper  is  related  to  other 
models  of  elites  manipulating  policies  in  democratic  settings,  including  lobbying  models,  such 
as  Austen-Smith  (1987),  Baron  (1994),  Dixit,  Grossman  and  Helpman  (1997),  and  Grossman 
and  Helpman  (1996),  and  models  in  which  traditional  elites  are  able  to  capture  democratic 
politics,  e.g.,  Acemoglu  and  Robinson  (2006b). 

The  small  literature  on  the  inefficiency  of  the  form  of  redistribution  is  also  closely  related 
to  our  work.  Becker  and  Mulligan  (2003)  and  Wilson  (1990)  argue  that  inefficient  methods 
of  redistribution  are  chosen  as  a  way  of  limiting  the  amount  of  redistribution  (see  also  Coate 
and  Morris,  1995,  Rodrik,  1995).  There  is  a  close  connection  between  this  idea  and  the  main 
mechanism  in  our  paper,  whereby  an  inefficient  state  is  chosen  by  the  rich  in  order  to  limit  the 
amount  of  future  redistribution.  Nevertheless,  there  is  also  an  important  distinction;  in  the 
basic Becker-Mulligan-Wilson story, it is not clear why the society can commit to the form of
redistribution  and  not  to  the  amount  of  redistribution.  In  contrast,  in  our  model  the  choice  of 
an  inefficient  bureaucracy  is  a  way  of  affecting  the  future  political  equilibrium  so  as  to  bring 
the  party  aligned  with  the  interests  of  the  rich  to  political  power,  and  via  this  channel,  to 
limit  the  provision  of  public  goods  and  taxation.  As  such,  our  mechanism  is  also  related  to 
the  rationale  for  inefficient  redistribution  suggested  in  Saint-Paul  (1996)  and  Acemoglu  and 
Robinson  (2001),  where  a  politically  powerful  group  may  push  for  inefficient  forms  of  transfers 
in  order  to  maintain  its  future  political  power. 

There is also a small literature on how politicians may distort policies for strategic reasons.
Papers in this literature include models where inefficient policies (such as excessive state
employment)  are  chosen  in  order  to  gain  votes  (e.g.,  Fiorina  and  Noll,  1978,  Geddes,  1991, 
Shleifer  and  Vishny,  1994,  Lizzeri  and  Persico,  2001,  Robinson  and  Torvik,  2005).  Still  other 
papers  suggest  that  inefficient  choices  (including  wasteful  investments,  large  budget  deficits, 
and  inefficient  fiscal  systems)  are  made  in  order  to  constrain  future  politicians  (e.g.,  Glazer, 
1989,  Persson  and  Svensson,  1989,  Tabellini  and  Alesina,  1990,  Aghion  and  Bolton,  1990, 
Cukierman,  Edwards,  and  Tabellini,  1992).  None  of  these  papers  feature  the  mechanism  of  an 
elite  creating  an  inefficient  state  structure  to  maintain  their  political  power  in  the  face  of  an 
emerging  democracy. 

Our  model  is  also  related  to  the  literature  on  comparative  politics  and  public  finance 


(e.g.,  Persson  and  Tabellini,  2003,  Ticchi  and  Vindigni,  2005),  which  investigates  sources  of 
differences  in  fiscal  policies  among  democracies.  Our  approach  suggests  an  alternative,  but 
complementary,  source  of  variation,  related  to  the  desire  and  the  ability  of  the  economic  elite 
to  dominate  democratic  politics,  which  can  generate  both  differences  in  the  level  of  public 
goods  provision  and  in  the  efficiency  of  the  state. 

Finally,  our  approach  is  related  to  sociological  analyses  of  "cooptation"  in  democracy  by 
existing  elites  in  the  Marxist  sociology  and  political  science  literatures.  In  particular,  Therborn 
(1980  pp.  228-234)  argues  that  the  control  of  the  state  apparatus  is  a  crucial  objective  of  the 
economic  elites  in  democracy,  which  they  achieve  using  strategies  including  cooptation  (see 
also  the  discussion  of  hegemony  in  Gramsci,  1971).  However,  this  literature  neither  articulates 
a  mechanism  through  which  the  elite  may  accomplish  these  objectives  nor  models  the  costs  of 
such  a  strategy  relative  to  other  options.7 

3     Basic  Model 

3.1     Description  of  the  Economic  Environment 

Consider  the  following  discrete  time  infinite-horizon  economy  populated  by  a  continuum  1  of 
agents,  each  of  which  has  the  following  risk-neutral  preferences 

$$E_0 \sum_{t=0}^{\infty} \beta^t \left( c_t^j + G_t - h e_t^j \right),$$

at time $t = 0$, where $E_0$ is the expectation at time $t = 0$, $\beta \in (0, 1)$ is the discount factor,
$c_t^j \geq 0$ denotes the consumption of the agent in question (agent $j$), $G_t \geq 0$ is the level of
public good enjoyed by all agents, $e_t^j \in \{0, 1\}$ is the effort decision of the agent (which will be
necessary in some occupations), and $h > 0$ is the cost of effort.
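
As a purely illustrative aid, the following sketch evaluates these preferences along a deterministic, finite path, so that the expectation is trivial and the infinite sum is truncated. The function name, the truncation, and the numerical values are our assumptions for illustration and are not part of the model.

```python
from typing import Sequence


def discounted_utility(
    beta: float,
    consumption: Sequence[float],
    public_good: Sequence[float],
    effort: Sequence[int],
    h: float,
) -> float:
    """Sum of beta**t * (c_t + G_t - h * e_t) over a finite, deterministic path.

    Illustrative truncation of the infinite-horizon objective; the path
    length is whatever the caller supplies.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    if h <= 0.0:
        raise ValueError("the effort cost h must be positive")
    if not (len(consumption) == len(public_good) == len(effort)):
        raise ValueError("all paths must have the same length")

    total = 0.0
    for t, (c, g, e) in enumerate(zip(consumption, public_good, effort)):
        if e not in (0, 1):
            raise ValueError("effort must be 0 or 1")
        if c < 0.0 or g < 0.0:
            raise ValueError("consumption and the public good must be non-negative")
        total += beta**t * (c + g - h * e)
    return total


# Two periods: effort is exerted only in the first one.
example = discounted_utility(0.5, [1.0, 1.0], [2.0, 2.0], [1, 0], h=0.5)
```

Because preferences are linear in consumption and the public good, only the discounted sum of net payoffs matters, which is what makes the comparisons between occupations in the rest of the model tractable.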

There are two types of agents: $n > 1/2$ are poor (low-skill), while $1 - n$ are rich (high-skill).
We denote poor agents by the symbol $L$ (corresponding to low-productivity), and rich agents
by $H$, and also use $\mathcal{L}$ and $\mathcal{H}$ to denote the set of poor and rich agents.

There  are  two  occupations:  producer  and  bureaucrat.  In  each  period,  as  long  as  some 
amount of investment in infrastructure, $K > 0$, is undertaken, each producer generates an
income depending on his skill; $A^L$ for poor agents and $A^H > A^L$ for rich agents. If the


7 Another major difference between the Marxist approaches and ours is that in our model bureaucrats can
side  either  with  rich  or  poor  agents,  whereas  in  most  Marxist  approaches,  the  state  apparatus  is,  ultimately, 
controlled  by  the  economic  elite  (e.g.,  Miliband,  1969,  Poulantzas,  1978,  Therborn,  1980).  In  this  respect,  the 
notion  of  bureaucracy  and  state  apparatus  in  our  model  is  also  different  from  that  of  Max  Weber,  which  views 
bureaucracy  as  an  "apolitical"  organization,  with  no  goals  or  interests.  See  also  Alford  and  Friedland  (1985)  for 
a critical discussion of Marxist and non-Marxist theories of the state.


investment  in  infrastructure  K  is  not  undertaken  at  time  t,  then  no  agent  can  produce  within 
that  period.  Producers  receive  and  consume  their  income  net  of  taxes. 

A set of agents denoted by $\mathcal{X}_t$ are bureaucrats at time $t$. These agents do not produce,
but receive a net wage of $w_t \geq 0$ from the government (i.e., they do not pay taxes on their
wage income). The role of bureaucrats is tax collection. In particular, we will allow for a
linear tax rate $\tau_t \in [0, 1]$ on earned incomes in order to finance the infrastructure investment
$K$, additional spending on the public good $G_t$ and the wages of bureaucrats. This tax rate
is  the  same  irrespective  of  whether  the  individual  is  rich  or  poor.  To  simplify  the  discussion, 
we  assume  that  only  poor  agents  can  become  bureaucrats.  This  assumption  is  not  necessary 
for  the  results,  since  it  will  be  evident  below  that  low-productivity  poor  agents  always  prefer 
bureaucracy  more  than  the  high-productivity  rich  agents  (see  Remark  2  below). 

Both  rich  and  poor  agents  can  try  to  evade  taxes.  We  assume  that  if  an  individual  tries 
to evade taxes, he gets caught with probability $p(x_t)$, where $p : [0, 1] \to [0, 1]$ is an increasing,
twice continuously differentiable, and strictly concave function with $p(0) = 0$, and $x_t$ denotes
the number of bureaucrats exerting positive effort at time $t$. More formally, this is defined as
$x_t = \int_{j \in \mathcal{X}_t} e_t^j \, dj$. This expression incorporates the fact that bureaucrats who do not exert effort
are not useful.8

If  an  individual  is  caught  evading  taxes,  all  of  his  income  during  that  period  is  lost.  For 
simplicity,  we  assume  that  this  income  does  not  accrue  to  the  government  either  (though 
this  is  not  an  important  assumption).  We  also  assume  that  there  is  full  anonymity  in  the 
market,  so  that  the  past  history  of  individual  producers  is  not  observed.  This  implies  that 
future  punishments  on  tax  evaders  are  not  possible.  Moreover,  because  of  limited  liability,  i.e., 
$c_t^j \geq 0$, more serious punishments are not possible.
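
The evasion decision implied by these assumptions can be sketched as follows. A risk-neutral producer who pays taxes keeps a share $1 - \tau_t$ of his income, while one who evades keeps all of it with probability $1 - p(x_t)$ and nothing otherwise, so evasion is attractive only when $\tau_t > p(x_t)$. The functional form $p(x) = 2x - x^2$, the function names, and the rule that an indifferent producer complies are our illustrative assumptions; the model only requires $p$ to be increasing, strictly concave, and to satisfy $p(0) = 0$.

```python
def detection_probability(x: float) -> float:
    """Hypothetical p(x) = 2x - x**2: increasing and strictly concave on [0, 1], p(0) = 0."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("the mass of working bureaucrats must lie in [0, 1]")
    return 2.0 * x - x * x


def payoff_if_compliant(income: float, tax_rate: float) -> float:
    """Income net of the linear tax."""
    return (1.0 - tax_rate) * income


def expected_payoff_if_evading(income: float, x: float) -> float:
    """All income is kept if undetected and lost if caught."""
    return (1.0 - detection_probability(x)) * income


def prefers_to_evade(income: float, tax_rate: float, x: float) -> bool:
    """True when evasion yields a strictly higher expected payoff (ties resolved toward compliance)."""
    return expected_payoff_if_evading(income, x) > payoff_if_compliant(income, tax_rate)


# With x = 0.5 the hypothetical detection probability is 0.75.
low_tax_evades = prefers_to_evade(income=2.0, tax_rate=0.7, x=0.5)
high_tax_evades = prefers_to_evade(income=2.0, tax_rate=0.8, x=0.5)
```

Since income multiplies both payoffs, the comparison does not depend on whether the producer is rich or poor: in this sketch, the highest tax rate that can be collected without evasion is $p(x_t)$ for both groups.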

Since  effort  is  costly,  bureaucrats  will  exert  effort  only  if  their  compensation  depends  on 
their  effort  decision.  We  assume  that  if  they  do  not  exert  effort,  bureaucrats  are  caught  with 
probability $q_t$ at time $t$. If they are not caught, they receive the wage $w_t$, and if they are
caught  shirking,  they  lose  their  wage,  but  are  not  fired  from  the  bureaucracy.  This  assumption 
simplifies  the  algebra  and  the  exposition  considerably  and  is  relaxed  in  subsection  5.1  below. 

The probability of detection $q_t$ depends on the quality of the organization of the state
("efficiency of the state"). In particular, we allow two levels of efficiency, $I_t \in \{0, 1\}$, such that
$q(I = 1) = 1$, so that with an efficient organization of the state any shirking bureaucrat is


8 Alternatively, instead of inducing bureaucrats to exert effort, it may be important to ensure that they do
not  accept  bribes  from  the  individuals  supposed  to  pay  taxes  (e.g.,  Acemoglu  and  Verdier,  1998,  2000).  We 
investigate  a  variant  of  our  model  with  corruption  in  subsection  5.3. 


immediately caught, while $q(I = 0) = q_0 < 1$, so that with an inefficient organization shirking
bureaucrats are not necessarily detected. To simplify the analysis we assume that $I = 1$ has
no cost relative to $I = 0$.9
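
To illustrate why the detection probability matters for bureaucratic pay, consider the one-period comparison implied by the assumptions above. A bureaucrat who exerts effort obtains $w_t - h$, while one who shirks keeps the wage with probability $1 - q_t$ and, since he is not fired, faces the same continuation; effort is therefore chosen when $w_t - h \geq (1 - q_t) w_t$, that is, when $w_t \geq h / q_t$. The sketch below encodes this threshold. The function names and numbers are ours, and the surplus it reports is measured only relative to the effort cost, ignoring the bureaucrat's outside option as a producer.

```python
def min_incentive_wage(h: float, q: float) -> float:
    """Lowest wage at which exerting effort is weakly preferred to shirking: h / q."""
    if h <= 0.0:
        raise ValueError("the effort cost h must be positive")
    if not 0.0 < q <= 1.0:
        raise ValueError("the detection probability q must lie in (0, 1]")
    return h / q


def surplus_over_effort_cost(h: float, q: float) -> float:
    """Wage minus effort cost at the threshold wage; zero when shirking is always detected."""
    return min_incentive_wage(h, q) - h


# Efficient state (q = 1) versus a hypothetical inefficient state with q0 = 0.5.
wage_efficient = min_incentive_wage(h=1.0, q=1.0)
wage_inefficient = min_incentive_wage(h=1.0, q=0.5)
surplus_inefficient = surplus_over_effort_cost(h=1.0, q=0.5)
```

With perfect monitoring the wage just covers the cost of effort, whereas with $q_0 < 1$ the wage must exceed it; this gap is the source of the bureaucratic rents emphasized in the Introduction.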

At  each  date,  the  political  system  chooses  the  following  policies: 

•  A tax rate on all earned income $\tau_t \in [0, 1]$.

•  The wage rate for bureaucrats $w_t \in \mathbb{R}_+$.

•  A level of public good $G_t \in \mathbb{R}_+$.

•  The number of bureaucrats hired, $X_t \in [0, 1]$.

•  A decision on the organization of the state for the next date, $I_t \in \{0, 1\}$. The efficiency
of the state at the current date, $I_{t-1}$, is part of the state variable, determined by choices
in the previous period.

The  additional  restrictions  on  these  policies  are  as  follows: 

1.  The  government  budget  constraint  (specified  below)  has  to  be  satisfied  at  every  date. 

2.  If $X_t \geq X_{t-1}$, then existing bureaucrats cannot be fired (although each bureaucrat can
decide to quit if he finds this beneficial). Moreover, if $X_t < X_{t-1}$, then no new bureaucrats
are hired and a fraction $(X_{t-1} - X_t)/X_{t-1}$ of the bureaucrats is fired (those fired being
randomly chosen irrespective of their past history).

We denote a vector of policies satisfying these restrictions by $p_t = (\tau_t, w_t, G_t, X_t, I_t) \in \mathcal{P}$.
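
For concreteness, the policy vector and the hiring and firing rule can be represented as in the following sketch. The class and field names are hypothetical, and the dismissed share is read here as the reduction in the size of the bureaucracy relative to its previous size, which is our interpretation of the rule. The government budget constraint is not checked because it is specified only later in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Illustrative container for the policy vector (tau_t, w_t, G_t, X_t, I_t)."""

    tax_rate: float      # tau_t, linear tax on earned income, in [0, 1]
    wage: float          # w_t, net wage of bureaucrats, non-negative
    public_good: float   # G_t, non-negative
    bureaucrats: float   # X_t, mass of bureaucrats hired, in [0, 1]
    efficient_next: int  # I_t, organization of the state for the next date, 0 or 1

    def __post_init__(self) -> None:
        if not 0.0 <= self.tax_rate <= 1.0:
            raise ValueError("tax_rate must lie in [0, 1]")
        if self.wage < 0.0 or self.public_good < 0.0:
            raise ValueError("wage and public_good must be non-negative")
        if not 0.0 <= self.bureaucrats <= 1.0:
            raise ValueError("bureaucrats must lie in [0, 1]")
        if self.efficient_next not in (0, 1):
            raise ValueError("efficient_next must be 0 or 1")


def fraction_fired(previous_size: float, new_size: float) -> float:
    """Share of existing bureaucrats dismissed when the bureaucracy shrinks, zero otherwise."""
    if new_size >= previous_size:
        return 0.0  # no firing when the bureaucracy is maintained or expanded
    return (previous_size - new_size) / previous_size


policy = Policy(tax_rate=0.3, wage=1.2, public_good=0.5, bureaucrats=0.25, efficient_next=0)
shrink_share = fraction_fired(previous_size=0.5, new_size=policy.bureaucrats)
```

Keeping $I_t$ in the policy vector while treating $I_{t-1}$ as inherited reflects the timing in the text: the party in power today chooses the organization of the state that its successor will face.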

3.2     Description  of  the  Political  System 

We  will  consider  three  different  political  environments: 

1.  Permanent  nondemocracy:  the  rich  elite  are  in  power  at  all  dates,  meaning  that  only 
the  rich  can  vote,  and  since  all  rich  agents  have  the  same  policy  preferences  over  the 
available  set  of  policies,  the  policy  vector  most  preferred  by  a  representative  rich  elite 
will  be  implemented. 


9 In general, one can imagine that setting up a more efficient state apparatus may involve additional expenditures.
We ignore those both to simplify the algebra and also to highlight that inefficient states can arise even
when  an  efficient  organization  is  costlessly  available. 


2.  Permanent  Democracy:  the  citizens,  who  form  the  majority,  are  in  power  at  all  dates 
starting  at  t  =  0  (or  at  all  dates  there  are  elections  as  described  below). 

3.  Emerging Democracy: the rich elite are in power at $t = 0$, and at all future dates, the
regime will be democratic with majoritarian elections.

The  first  two  environments  are  for  comparison.  The  third  one  is  our  main  focus  in  this 
paper. It is a simple way of capturing the idea that some decisions are originally taken by
elites, anticipating that democracy will arrive at some point, in this case right at date $t = 1$.10

To  start  with,  we  model  the  democratic  system  in  a  very  simple  way,  by  assuming  that  there 
are  two  parties,  one  run  by  an  elite  agent  and  one  run  by  a  poor  agent,  and  that  bureaucrats 
cannot run for office. We use the symbols $P$ and $R$ to denote these parties, and $d_t = P$
denotes that party $P$ is elected to office at date $t$. Parties are unable to make commitments
to  the  policies  they  will  implement  once  they  come  to  power.  Thus  whichever  party  receives 
the  majority  of  the  votes  comes  to  power  and  the  agent  in  control  of  the  party  chooses  the 
policy  vector  that  maximizes  his  own  utility.  This  last  assumption  departs  from  the  standard 
Downsian  models  of  political  competition  where  parties  commit  to  their  policy  platform  before 
the  election.  Instead,  it  is  closer  to  the  literature  on  citizen-candidate  models,  which  will  be 
discussed  further  in  subsection  5.2  (see  also  Alesina,  1988).  Specifically,  in  subsection  5.2, 
we  will  consider  a  richer  model  of  democratic  politics,  where  each  agent  can  run  as  a  citizen- 
candidate,  and  we  will  show  that  the  same  results  apply  with  this  richer  setup.  Nevertheless, 
it  is  useful  to  start  with  the  simpler  environment  with  only  two  parties  to  highlight  the  main 
economic  forces. 

3.3     Timing  of  Events 

To  recap,  the  timing  of  events  within  each  date  is: 

•  The society starts with some political regime, nondemocracy or democracy, i.e., $s_t \in \{N, D\}$,
a set $\mathcal{X}_{t-1} \subset \mathcal{L}$ of agents who are already bureaucrats (since, by assumption,
the set of bureaucrats $\mathcal{X}_{t-1}$ must be a subset of the set of poor agents), and a level of
efficiency of the state, $I_{t-1} \in \{0, 1\}$. Then:


10 In this case, the society is nondemocratic at date $t = 0$, and we assume that it will become democratic for
exogenous reasons at date $t = 1$. It is possible to model democratization as equilibrium institutional change along
the lines of the models of endogenous democratization in the literature (see Acemoglu and Robinson, 2006a, for
a discussion and references), but doing so would complicate the analysis without generating additional economic
insights in the current context.

1.  In democracy, all individuals $j \in [0, 1]$ vote for either party $P$ or party $R$, i.e., individual
$j$ decides $v_t^j \in \{P, R\}$.

2.  In democracy the elected party, or in nondemocracy the representative elite agent, decides
the policy vector $p_t = (\tau_t, w_t, G_t, X_t, I_t) \in \mathcal{P}$.

3.  Observing this vector, each individual $j \notin \mathcal{X}_{t-1}$ decides whether to apply to become a
bureaucrat, $\chi_t^j \in \{0, 1\}$, and each individual $j \in \mathcal{X}_{t-1}$ decides whether to quit
bureaucracy, $\chi_t^j \in \{0, 1\}$ (which is denoted by the same symbol without any risk of confusion).
Naturally, by assumption, $\chi_t^j = 0$ for all the rich agents. The number of bureaucrats at
time $t$ is then $\min\{X_t, \int_0^1 \chi_t^j \, dj\}$, i.e., the minimum of the number of bureaucrats chosen
by the polity in power and the number of people applying to or remaining in bureaucracy.
This also determines the current set of bureaucrats, $\mathcal{X}_t$.

4.  Each bureaucrat decides whether to exert effort, $e_t^j \in \{0, 1\}$, which determines
$x_t = \int_{j \in \mathcal{X}_t} e_t^j \, dj$, and thus the probability of detection of individuals evading taxes.

5.  Production takes place and each producer decides whether to evade taxes or not, denoted
by $z_t^j \in \{0, 1\}$.

6.  A fraction $p(x_t)$ of producers evading taxes are caught.

7.  A fraction $q_t = q(I_{t-1})$ of shirking bureaucrats are caught and punished.

8.  Taxes are collected, remaining bureaucrats are paid their wage, $w_t$, and the public good
$G_t$ is supplied.

Naturally, the society starts with $\mathcal{X}_{-1} = \emptyset$, i.e., at the initial date there are no incumbent
bureaucrats. We also suppose that $I_{-1} = 0$ (though this has no bearing on any of our results
except the actions at time $t = 0$, since the choice of $I_t \in \{0, 1\}$ is without any costs).

4      Characterization  of  Equilibria 

We  now  characterize  the  equilibrium  of  the  environments  described  above. 

4.1     Definition  of  Equilibrium 

Throughout,  we  focus  on  pure  strategy  Markov  Perfect  Equilibria  (MPE).11 


11 We focus on MPE both because in the current context the MPE is unique and is relatively straightforward
to  characterize  and  also  because  the  focus  on  MPE  makes  the  emergence  of  a  coalition  between  the  rich  and  the 
bureaucrats  more  difficult  (since  there  cannot  be  "commitment"  to  future  rents  for  bureaucrats). 



Recall that Markovian strategies condition only on the payoff-relevant state variables (and
on the prior actions within the same stage game). An MPE is defined as a set of Markovian
strategies that are best responses to each other given every history. In the current game, the
aggregate state vector can be represented as $S_t = (s_t, I_{t-1}, X_{t-1}) \in \mathcal{S}$, where $s_t \in \{N, D\}$ is the
political regime at time $t$, $I_{t-1} \in \{0, 1\}$ is the efficiency of the bureaucracy inherited from the
previous period, and $X_{t-1}$ is the size of the bureaucracy inherited from the previous period.12
Individual actions will be a function of the aggregate state vector $S_t$ and the individual's
identity, in particular, $a_t \in \{L, H, B\}$, representing whether the individual is a poor producer,
rich producer or a bureaucrat. Thus as a function of $S_t$ and $a_t$, each individual will decide
which party to vote for, i.e., $v_t^j \in \{P, R\}$, whether to apply (or to remain) in bureaucracy,
$\chi_t^j \in \{0, 1\}$, whether to evade taxes, $z_t^j \in \{0, 1\}$, if the individual is a producer, and whether
to exert effort, $e_t^j \in \{0, 1\}$, if the individual is a bureaucrat. Finally, strategies also include
the choice of $I_t \in \{0, 1\}$, $\tau_t \in [0, 1]$, $X_t \in [0, n]$, and $G_t \in \mathbb{R}_+$ when the individual is the party
leader. Thus Markovian strategies can be represented by the following mapping

$\sigma : \mathcal{S} \times \{L, H, B\} \to \{P, R\} \times \{0, 1\}^3 \times [0, 1] \times [0, n] \times \mathbb{R}_+.$

An MPE is a mapping $\sigma^*$ that is a best response to itself at every possible history.

We will often refer to subcomponents of $\sigma$ rather than the entire strategy profile, and with
a slight abuse of notation, we will use $v(I \mid a)$ to denote the voting strategy of an individual
of group $a \in \{L, H, B\}$ as a function of the efficiency of the state institutions. Moreover,
when there is no risk of confusion, we will use the index $j$ to denote individuals or groups
interchangeably.

4.2     Preliminary  Results 

We  now  state  a  number  of  results  that  will  be  useful  throughout  the  analysis. 

Lemma 1  If $p(x_t) < \tau_t$, then $z_t^j = 0$ for all $j \notin \mathcal{X}_t$, i.e., all producers evade taxes at time $t$.

Proof. Write the payoff of an individual producer $j \notin \mathcal{X}_t$ at time $t$ when the tax rate is $\tau_t$
and the size of (effort-exerting) bureaucracy is $x_t$ as

$V_t^j = \max\{(1 - \tau_t) A^j, (1 - p(x_t)) A^j\} + G_t + \beta V_{t+1}^j(\sigma^*),$


12 In addition, for each individual we could specify whether the individual is currently a bureaucrat, i.e.,
whether $j \in \mathcal{X}_{t-1}$, and whether he is a party leader as part of the individual-specific state vector. Nevertheless,
Markovian strategies can be defined without doing this, which simplifies the notation.

where $A^j$ is the productivity of this individual, and $\sigma^*$ is the optimal policy, so that $\beta V_{t+1}^j(\sigma^*)$
is the discounted optimal continuation value for the individual. The max incorporates two
terms. The first, $(1 - \tau_t) A^j$, is what the individual will consume if he pays a fraction $\tau_t$ of
his income in taxes. The second, $(1 - p(x_t)) A^j$, is his expected consumption when evading
taxes. In particular, in this case, the individual takes home his full productivity $A^j$ with
probability $1 - p(x_t)$, but is caught and loses all his current income with probability $p(x_t)$.
Limited liability implies $c_t^j \geq 0$, and the current behavior has no effect on the continuation value
$\beta V_{t+1}^j(\sigma^*)$ given the anonymity assumption. This expression immediately establishes that the
max term will pick tax evasion, i.e., $z_t^j = 0$, if $p(x_t) < \tau_t$, as claimed in the lemma.    ■
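The comparison behind Lemma 1 can be sketched in a few lines of Python. This is only an illustration: the function names and the numerical values are assumptions introduced here, not part of the model.

```python
def producer_payoffs(tau, p_x, productivity):
    """Current-period consumption from paying taxes versus evading them."""
    pay = (1.0 - tau) * productivity    # keeps a share 1 - tau for sure
    evade = (1.0 - p_x) * productivity  # keeps everything unless caught
    return pay, evade


def evades(tau, p_x, productivity=1.0):
    """True when evasion gives strictly higher expected consumption."""
    pay, evade = producer_payoffs(tau, p_x, productivity)
    return evade > pay


# Illustrative numbers (assumed).
print(evades(tau=0.4, p_x=0.2))  # detection probability below the tax rate
print(evades(tau=0.2, p_x=0.4))  # detection probability above the tax rate
```

Because productivity multiplies both terms, the decision depends only on whether the detection probability falls short of the tax rate, which is why the lemma holds for rich and poor producers alike.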

Since  with  tax  evasion  there  is  no  government  revenue,  Lemma  1  implies  that  in  equilibrium 
we  need  to  have  the  following  incentive  compatibility  constraint  for  producers 

$p(x_t) \geq \tau_t$

to be satisfied. Alternatively, defining

$\pi(\tau) \equiv p^{-1}(\tau),$  (1)

producers' incentive compatibility constraint can be expressed as:13

$x_t \geq \pi(\tau_t).$  (2)

This condition requires the number of bureaucrats exerting effort to be at least $\pi(\tau_t)$.
This constraint is sufficient to ensure that all individuals choose not to evade taxes.

It can be easily verified that since $p(\cdot)$ is strictly increasing, continuously differentiable
and strictly concave, $\pi(\cdot)$ defined in (1) is strictly increasing, continuously differentiable and
strictly convex.
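As a numerical illustration of definition (1), the sketch below inverts an assumed detection technology by bisection. The functional form $p(x) = \sqrt{x}$ is a hypothetical choice made only for this example (the paper does not specify one); under it, the inverse is $\tau^2$, which is increasing and strictly convex as the text asserts.

```python
import math


def p(x):
    """Assumed detection technology: increasing and strictly concave on [0, 1]."""
    return math.sqrt(x)


def pi(tau, tol=1e-12):
    """Invert p by bisection: the smallest x in [0, 1] with p(x) >= tau."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p(mid) < tau:
            lo = mid
        else:
            hi = mid
    return hi


print(pi(0.5))  # close to 0.25, since sqrt(0.25) = 0.5
```

The value returned is the minimum effective size of the bureaucracy that satisfies constraint (2) at the given tax rate.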

Lemma 2  If

$w_t < \frac{h}{q_t},$  (3)

then $e_t^j = 0$ for all $j \in \mathcal{X}_t$, i.e., all bureaucrats will shirk at time $t$.

Proof. Write the payoff of a bureaucrat $j \in \mathcal{X}_t$ at time $t$ when the wage rate is $w_t$ and the
detection probability is $q_t$ as

$V_t^j = \max\{w_t - h, (1 - q_t) w_t\} + G_t + \beta V_{t+1}^j(\sigma^*),$


13 This condition can also be interpreted as a "state capacity constraint" since, given the effective size of the
bureaucracy, it determines the maximum tax rate.

where $\sigma^*$ is the optimal policy, so that $\beta V_{t+1}^j(\sigma^*)$ is the discounted optimal continuation value
for the individual. The max operator incorporates two terms representing the payoff to exerting
effort and receiving the wage for sure, $w_t - h$, and the payoff to shirking. Since, by assumption,
bureaucrats cannot be fired for shirking and limited liability makes sure that $c_t^j \geq 0$, the payoff
to shirking is $(1 - q_t) w_t$. Whenever $w_t < h/q_t$, the max operator will pick the second term, so
that we have $e_t^j = 0$ for all $j \in \mathcal{X}_t$ as claimed in the lemma.    ■
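The effort decision in Lemma 2 can be illustrated with a short sketch. The function names and the values of the effort cost and detection probabilities are assumptions chosen for the example.

```python
def exerts_effort(w, h, q):
    """True when the payoff to effort, w - h, is at least the payoff to shirking."""
    return w - h >= (1.0 - q) * w


def min_wage_for_effort(h, q):
    """Lowest wage at which shirking stops being strictly better, h / q."""
    return h / q


h = 1.0                        # assumed effort cost
for q in (1.0, 0.5):           # efficient state versus an assumed q_0 = 0.5
    w = min_wage_for_effort(h, q)
    print(q, w, w - h)         # detection rate, wage, wage net of effort cost
```

With perfect detection the threshold wage just covers the effort cost, whereas with imperfect detection it exceeds the effort cost, so the wage needed to induce effort leaves the bureaucrat a surplus.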

Remark  1  In  subsection  5.1  below,  we  will  allow  bureaucrats  caught  shirking  to  be  fired  from 
bureaucracy.  In  this  case,  it  is  clear  that  the  optimal  contract  involves  firing  a  bureaucrat  if  he 
is  caught  shirking.  Given  this,  the  condition  in  Lemma  2  will  have  to  be  forward-looking,  taking 
into  account  the  future  rents  that  the  bureaucrat  will  lose  if  caught  shirking.  In  particular, 
imagine a stationary equilibrium, where today and in all future periods the tax rate is equal
to $\hat\tau$, the wage rate for bureaucrats is $w$, and the probability of getting caught is $q$, then the
necessary condition (3) would become

$\frac{w - h}{1 - \beta} < q \frac{\beta (1 - \hat\tau) A^L}{1 - \beta} + (1 - q) \left( w + \beta \frac{w - h}{1 - \beta} \right),$

since the left-hand side is what the individual would receive by exerting effort at every date,
whereas the right-hand side is the payoff to deviating for one period, and then switching
to exerting effort from then on (implicitly using the one-step ahead deviation principle, see
Fudenberg and Tirole, 1991, Chapter 4). In particular, the right-hand side has the individual
getting caught with probability $q$, receiving nothing today and the wage of a low-skill producer
from then on, and not getting caught with probability $1 - q$, in which case he receives $w$ today
and then receives the discounted version of the left-hand side (as he switches back to exerting
effort). A bureaucrat who loses his job always receives the wage of a low-skill producer from
then on, since along the equilibrium path, there will be no further hiring into bureaucracy.
Rearranging terms, the above inequality can be expressed as:

$w < \beta (1 - \hat\tau) A^L + \frac{1 - \beta (1 - q)}{q} h.$  (4)

In a stationary equilibrium where bureaucrats are fired when caught shirking, condition (4) will
replace (3), and when it is satisfied, all bureaucrats will shirk. Correspondingly, the incentive
compatibility constraint, (5), below will change to the converse of this condition. We return
to a further analysis of this case in subsection 5.1.
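For readers who want the intermediate algebra behind (4), the following sketch spells it out. It assumes the stationary inequality has exactly the form described in the text: a shirker is caught with probability $q$, receives nothing today and the low-skill producer's income thereafter, and otherwise keeps $w$ today and returns to exerting effort.

```latex
\begin{align*}
\frac{w-h}{1-\beta}
  &< q\,\frac{\beta(1-\hat\tau)A^L}{1-\beta}
   + (1-q)\left(w + \beta\,\frac{w-h}{1-\beta}\right) \\
w - h
  &< q\beta(1-\hat\tau)A^L + (1-q)\bigl[(1-\beta)w + \beta(w-h)\bigr]
  && \text{(multiply by } 1-\beta\text{)} \\
w - h
  &< q\beta(1-\hat\tau)A^L + (1-q)(w - \beta h) \\
q\,w
  &< q\beta(1-\hat\tau)A^L + h\bigl[1 - \beta(1-q)\bigr] \\
w
  &< \beta(1-\hat\tau)A^L + \frac{1-\beta(1-q)}{q}\,h .
\end{align*}
```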

If bureaucrats are expected to shirk, all individuals will evade taxes and there will be no
tax revenues. Consequently, the infrastructure investment $K$ could not be financed and there
would be no production. Thus, the society also needs to satisfy the incentive compatibility
constraint of the bureaucrats given by

$w_t \geq \frac{h}{q_t},$  (5)

where $q_t = q(I_{t-1})$. This constraint is necessary and sufficient to ensure that all bureaucrats
choose to exert effort. In addition, (poor) individuals must prefer to become bureaucrats. That
is, the participation constraint

$w_t \geq (1 - \tau_t) A^L + h,$  (6)

needs to be satisfied so that bureaucrats receive at least as much as they would obtain in
private production.

Remark 2  If rich agents could become bureaucrats, the equivalent participation constraint,
corresponding to (6), for rich agents would be

$w_t \geq (1 - \tau_t) A^H + h.$

Comparison  of  this  inequality  with  condition  (6)  makes  it  clear  that  poor  agents  are  always 
more  willing  to  enter  bureaucracy  than  rich  agents.  Our  assumption  that  rich  agents  cannot 
become  bureaucrats  therefore  enables  us  to  avoid  imposing  explicit  conditions  to  ensure  that 
this  inequality  is  not  satisfied  and  (6)  is. 

The  above  discussion,  in  particular  Lemmas  1  and  2,  immediately  establishes  the  following 
lemma  (proof  omitted) : 

Lemma 3  In any MPE, conditions (2), (5) and (6) must hold and $e_t^j = 1$ for all $j \in \mathcal{X}_t$ and
all $t$, and $z_t^j = 1$ for all $j \notin \mathcal{X}_t$ and all $t$.

In  other  words,  in  any  equilibrium  the  incentive  compatibility  constraints  of  producers 
and  bureaucrats  and  the  participation  constraint  of  bureaucrats  are  satisfied,  and  no  producer 
evades  taxes  and  all  bureaucrats  exert  effort. 

From Lemma 3 (and the fact that only poor agents become bureaucrats), it immediately
follows that, as long as the constraints (2) and (5) are satisfied, the government budget
constraint can be written as:

$K + G_t + w_t X_t \leq (1 - n) \tau_t A^H + (n - X_t) \tau_t A^L,$  (7)

where the left-hand side is government expenditures, consisting of the investment in
infrastructure, spending on public goods and bureaucrats' wages, while the right-hand side is
government tax receipts collected from rich and poor agents. This expression takes into account
that all bureaucrats exert effort and no producer evades taxes. Moreover, (7) highlights that in our
model, taxation reduces output through a particular general equilibrium mechanism: the
government can raise taxes only by hiring bureaucrats, and bureaucrats themselves do not produce
any output.
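The budget constraint (7) can be turned into a small helper that returns the largest public good spending a given policy can finance. The parameter values below are purely illustrative assumptions; the function name is hypothetical.

```python
def max_public_good(tau, w, X, n, A_H, A_L, K):
    """Largest G compatible with the budget constraint (7) at the given policy."""
    # Tax receipts come from the 1 - n rich and the n - X poor producers.
    revenue = (1.0 - n) * tau * A_H + (n - X) * tau * A_L
    spending = K + w * X  # infrastructure plus the wage bill
    return revenue - spending


# Illustrative parameter values (assumed, not taken from the paper).
params = dict(n=0.6, A_H=3.0, A_L=1.0, K=0.1)
print(max_public_good(tau=0.5, w=1.0, X=0.25, **params))
print(max_public_good(tau=0.5, w=1.0, X=0.35, **params))  # larger bureaucracy
```

Holding the tax rate and the wage fixed, a larger bureaucracy lowers the feasible public good twice over: the wage bill rises and the tax base of poor producers shrinks.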

Finally,  the  following  lemma  is  immediate  and  is  stated  without  proof: 

Lemma 4  Rich agents always vote for party $R$, i.e., for all $j \in \mathcal{H}$, $v_t^j = R$, and poor producers
always vote for party $P$, i.e., for all $j \in \mathcal{L}$ and $j \notin \mathcal{X}_{t-1}$, $v_t^j = P$.

4.3     Equilibria  under  Permanent  Democracy  and  Nondemocracy 

Equilibria under permanent democracy and permanent nondemocracy are of interest as a
comparison to our main political environment, which involves the society starting as nondemocratic
and then transitioning to democracy. The following results are straightforward:

Proposition 1  Under permanent democracy, there exists a unique MPE. In this equilibrium,
at each $t > 0$, $d_t = P$ and the following policy vector is implemented at each $t > 0$:

$I_t = 1, \quad w_t = (1 - \tau^D) A^L + h, \quad X_t = \pi(\tau^D),$

$G_t = G^D = (1 - n) \tau^D A^H + [n - \pi(\tau^D)] \tau^D A^L - [(1 - \tau^D) A^L + h] \pi(\tau^D) - K,$  (8)

and $\tau^D$ is the unique solution to the maximization problem:

$\max_{\tau, G} \; (1 - \tau) A^L + G$  (9)

subject to

$G = (1 - n) \tau A^H + [n - \pi(\tau)] \tau A^L - [(1 - \tau) A^L + h] \pi(\tau) - K.$

Proof. By Lemma 4, for all $j \in \mathcal{L}$, $v_t^j = P$. Under permanent democracy, the poor can
vote and form the majority starting at $t = 0$, thus $d_t = P$ for all $t$. Then the payoff to the
decisive voter $j' \in \mathcal{L}$ can be written as

$V_t^{j'} = (1 - \tau_t) A^L + G_t + \beta V_{t+1}^{j'}(\sigma^*),$

where again $\sigma^*$ is the optimal policy and $\beta V_{t+1}^{j'}(\sigma^*)$ is the discounted optimal continuation
value for this individual. The continuation value $\beta V_{t+1}^{j'}(\sigma^*)$ is unaffected by current policies,
thus the optimal policy can be determined as a solution to the following program:

$\max_{\tau, w, X, I, G} \; (1 - \tau) A^L + G$  (10)

subject to

$\pi(\tau) \leq X$
$\max\{ h / q(I), \; (1 - \tau) A^L + h \} \leq w$
$G \leq (1 - n) \tau A^H + [n - X] \tau A^L - w X - K$
$0 \leq G.$

It is evident that $I_t = 1$ relaxes the second constraint relative to $I_t = 0$, so will always be
chosen in all periods $t > 0$. Moreover, there cannot be a solution in which any one of the
first three constraints is slack (since this would allow an increase in $G$, raising the value of the
objective function), so we have $X = \pi(\tau)$ and $w = \max\{h, (1 - \tau) A^L + h\} = (1 - \tau) A^L + h$.
Substituting these equalities yields (9) for all periods where $I_t = 1$, i.e., for all $t > 0$. Strict
convexity of $\pi(\cdot)$ then ensures that $\tau^D$ is uniquely defined.    ■
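To make Proposition 1 concrete, the sketch below solves problem (9) numerically. Everything specific in it is an assumption made for illustration: the functional form $\pi(\tau) = \tau^2$ (as if $p(x) = \sqrt{x}$), the parameter values, and the use of ternary search. It also assumes an interior solution, so the constraint $G \geq 0$ is checked after the fact rather than imposed.

```python
def pi(tau):
    """Assumed functional form: p(x) = sqrt(x), so pi(tau) = tau**2 (strictly convex)."""
    return tau ** 2


def G_efficient(tau, n, A_H, A_L, h, K):
    """Public good implied by the constraint in (9), with X = pi(tau) and the
    efficient-state wage w = (1 - tau) * A_L + h."""
    X = pi(tau)
    w = (1.0 - tau) * A_L + h
    return (1.0 - n) * tau * A_H + (n - X) * tau * A_L - w * X - K


def poor_objective(tau, n, A_H, A_L, h, K):
    """Per-period payoff of a poor producer, the objective in (9)."""
    return (1.0 - tau) * A_L + G_efficient(tau, n, A_H, A_L, h, K)


def argmax_concave(f, lo=0.0, hi=1.0, iters=200):
    """Ternary search for the maximizer of a strictly concave function."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


params = dict(n=0.6, A_H=3.0, A_L=1.0, h=0.5, K=0.05)  # assumed values
tau_D = argmax_concave(lambda t: poor_objective(t, **params))
G_D = G_efficient(tau_D, **params)
print(tau_D, G_D)  # G_D > 0 here, so Condition 1 holds for these numbers
```

After substituting the constraint, the objective collapses to $(1 - \tau) A^L + \tau [(1 - n) A^H + n A^L] - (A^L + h) \pi(\tau) - K$, which is strictly concave whenever $\pi(\cdot)$ is strictly convex; this is why a simple search for a single peak is adequate in this sketch.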

Proposition 2  Under permanent nondemocracy, there exists a unique MPE. In this equilibrium,
the following policy vector is implemented at each $t > 0$:

$I_t = 1, \quad w_t = (1 - \tau^N) A^L + h, \quad X_t = \pi(\tau^N), \quad G_t = G^N = 0,$

and $\tau^N$ is the unique solution to the equation

$[(1 - \tau) A^L + h] \pi(\tau) - (1 - n) \tau A^H - [n - \pi(\tau)] \tau A^L + K = 0.$  (11)

Proof. Under permanent nondemocracy, the rich retain political power forever. Then the
payoff to the representative rich individual $j' \in \mathcal{H}$ can be written as

$V_t^{j'} = (1 - \tau_t) A^H + G_t + \beta V_{t+1}^{j'}(\sigma^*),$

where $\sigma^*$ is the optimal policy and $\beta V_{t+1}^{j'}(\sigma^*)$ is the discounted optimal continuation value for
this individual. Because the continuation value $\beta V_{t+1}^{j'}(\sigma^*)$ is unaffected by current policies,
the optimal policy can be determined as a solution to the following program:

$\max_{\tau, w, X, I, G} \; (1 - \tau) A^H + G$  (12)

subject to

$\pi(\tau) \leq X$
$\max\{ h / q(I), \; (1 - \tau) A^L + h \} \leq w$
$G \leq (1 - n) \tau A^H + [n - X] \tau A^L - w X - K$
$0 \leq G.$

It is again evident that $I_t = 1$ relaxes the second constraint relative to $I_t = 0$, so will always
be chosen. Moreover, the first three constraints must again hold as equalities, so we have
$X = \pi(\tau)$ and $w = \max\{h, (1 - \tau) A^L + h\} = (1 - \tau) A^L + h$. Substituting for these equalities
in program (12), it follows immediately that $G = 0$, and the strict convexity of $\pi(\cdot)$ again
ensures the uniqueness of the solution to (11).    ■
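Equation (11) can be solved numerically in the same spirit. The sketch below is an illustration under stated assumptions: $\pi(\tau) = \tau^2$, hypothetical parameter values, and a bracket $[0, 0.5]$ that happens to contain the relevant root for those values. It targets the root at which the budget surplus crosses zero from below, i.e., the lowest tax rate that finances $K$, which is the one the rich prefer.

```python
def pi(tau):
    """Assumed functional form, as if p(x) = sqrt(x)."""
    return tau ** 2


def budget_surplus(tau, n, A_H, A_L, h, K):
    """Tax receipts minus the wage bill and K at X = pi(tau); equation (11)
    sets this to zero (written there with the opposite sign)."""
    X = pi(tau)
    w = (1.0 - tau) * A_L + h
    return (1.0 - n) * tau * A_H + (n - X) * tau * A_L - w * X - K


def lowest_balancing_tax(surplus, lo=0.0, hi=0.5, tol=1e-12):
    """Bisection for the root where the surplus crosses zero from below."""
    assert surplus(lo) < 0.0 < surplus(hi), "bracket must straddle the root"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if surplus(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


params = dict(n=0.6, A_H=3.0, A_L=1.0, h=0.5, K=0.05)  # assumed values
tau_N = lowest_balancing_tax(lambda t: budget_surplus(t, **params))
print(tau_N)  # roughly 0.028 for these numbers
```

For a different functional form or parameter set, the bracket passed to the bisection would need to be chosen again so that the surplus changes sign on it.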

The  main  conclusion  from  both  of  these  benchmark  political  environments  is  that  the 
politically  decisive  agents  choose  a  policy  vector  consistent  with  their  own  interests,  and  this 
always involves an efficient organization of the state, i.e., $I_t = 1$ for all $t > 0$. There is no reason
to make the state inefficient. Consequently, both consolidated democratic and nondemocratic
regimes involve $I = 1$. Moreover, in both regimes the capacity of the state is fully utilized
in the sense that constraint (2) holds as equality and the minimum number of bureaucrats
necessary to prevent tax evasion is employed.

It is straightforward to see that the unique solution $(\tau^D, G^D)$ in (9) involves $\tau^D > 0$,
since infrastructure spending, $K > 0$, has to be financed (and for the same reason, $\tau^N > 0$
in Proposition 2). However, because raising further revenues involves the employment of
bureaucrats, which is costly, it is possible that the solution to (9) involves $G^D = 0$. If this were
the case, there would be no difference between the political bliss points of poor and rich agents
given in Propositions 1 and 2 and thus no interesting political conflict. Therefore, throughout
we are more interested in the case where the following condition is satisfied:

Condition 1  The solution to (9) involves $G^D > 0$.

It can be verified that if the gap between $A^H$ and $A^L$ is small and $\pi'(\tau)$ is large, this
condition will be violated. Therefore, this condition imposes that there is a certain degree of
inequality in society and raising taxes is not excessively costly, so that the poor would like
a higher level of public good provision than the rich. When Condition 1 is satisfied, it also
follows that $\tau^D > \tau^N$, and since $\pi(\cdot)$ is strictly increasing, $\pi(\tau^D) > \pi(\tau^N)$ and the size of
the bureaucracy is larger in permanent democracy than in permanent nondemocracy.

4.4     Political  Equilibrium  with  Regime  Change 

We now look at the more interesting case with regime change, i.e., where at date $t = 0$, the
rich  are  in  power  and  from  then  on  there  will  be  elections.  We  start  with  a  series  of  lemmas. 
Our  first  result  shows  that  with  efficient  state  institutions,  the  rich  will  choose  their  political 
bliss  point  as  in  Proposition  2: 

Lemma 5  In an MPE, if $d_t = R$ and $I_t = 1$, then $w_t = (1 - \tau^N) A^L + h$, $X_t = \pi(\tau^N)$,
$G_t = G^N = 0$, and $\tau^N$ is given by (11).

Proof. Given that $I_t = 1$, the solution to the equivalent of program (12) in the proof of
Proposition 2 for party $R$ involves choosing the policy vector $w_t = (1 - \tau^N) A^L + h$,
$X_t = \pi(\tau^N)$, $G_t = G^N = 0$.    ■

The next lemma establishes that the party representing the poor, party $P$, being elected
to  office  is  an  "absorbing  state,"  meaning  that  once  the  party  of  the  poor  is  elected,  the  results 
of  Proposition  1  apply  subsequently: 

Lemma 6  If $d_t = P$, then $d_{t'} = P$ for all $t' > t$, and we have the following equilibrium policy
vector at all dates $t' > t$:

$I_t = 1, \quad w_t = (1 - \tau^D) A^L + h, \quad X_t = \pi(\tau^D),$  (13)

$G_t = G^D = (1 - n) \tau^D A^H + [n - \pi(\tau^D)] \tau^D A^L - [(1 - \tau^D) A^L + h] \pi(\tau^D) - K,$

and $\tau^D$ is given by (9).

Proof. The policy vector in (13) is the optimal policy of the citizens in permanent
democracy (Proposition 1). Now suppose that party $P$ is in power at time $t$, and suppose that it
chooses the policy vector specified in the lemma. Since this includes $I_t = 1$, the following
period, we start with $I_t = 1$ as part of the payoff-relevant state vector. Suppose that $\sigma^*$ is such
that $v(I = 1 \mid B) = P$. Then party $P$ wins the majority at time $t + 1$. Alternatively suppose
that $v(I = 1 \mid B) \neq P$, but $X < n - 1/2$. Then, party $P$ again wins the majority at time $t + 1$.
In both cases, repeating this argument for the next period shows that party $P$ keeps power at
all dates and establishes the lemma.

To complete the proof we only need to rule out the case where $v(I = 1 \mid B) = R$ and
$X \geq n - 1/2$ (the proof to eliminate the case where bureaucrats randomize between the two
parties in a way to bring party $R$ to power is identical). Since $v(I = 1 \mid B) = R$ and $I = 0$
is costly for the rich (recall program (12) in the proof of Proposition 2), party $R$ will choose
$I_t = 1$. Then from Lemma 5,

$w_t = (1 - \tau^N) A^L + h, \quad X_t = \pi(\tau^N), \quad G_t = G^N = 0.$

This implies that the utility of the bureaucrat is the same as that of a poor producer. Then denoting
the utility of a bureaucrat supporting party $d$ by $V^B(d)$, we have

$V^B(R) = (1 - \tau^N) A^L + \beta V^j(\sigma^*)$
$\qquad < (1 - \tau^D) A^L + G^D + \beta V^j(\sigma^*)$
$\qquad = V^B(P),$

where the inequality follows from the fact that the last term is the maximal utility of a poor
agent. Since this is also the utility that a bureaucrat will receive when party $P$ is in power,
$v(I = 1 \mid B) = R$ cannot be a best response, completing the proof of the lemma.    ■

The intuition for this result is as follows. Once the party of the poor wins an election, they
will choose their preferred policy vector, which includes $I_t = 1$. Given an efficient state,
bureaucrats will have no reason to support the rich party, so the poor will continue to win
elections in all future periods and the organization of the state will continue to be efficient.
An efficient organization of the state ensures that bureaucrats receive no rents and receive the
same payoff as poor producers. Thus they will also support party $P$, and the political bliss
point of the poor will be implemented in all future periods. This lemma also implies that when
$I_{t-1} = 1$, i.e., when the state is efficient, the rich will not be able to win a majority. This
is related to the basic idea of our approach: the rich can only convince bureaucrats to vote
for their party by committing to giving them rents, and this can only be achieved when the
organization of the state is inefficient, i.e., $I_{t-1} = 0$.

We next investigate whether or not the rich may be able to convince the bureaucrats to
vote for their party starting with $I_{t-1} = 0$. Since there is no commitment to policies, the party
of the rich, when in power, will choose policies in line with its (the rich agents') preferences.
The next lemma characterizes these policies starting with $I_{t-1} = 0$.

Lemma 7  Suppose that $I_{t-1} = 0$, then $w_t = h/q_0$. Moreover, if $d_t = R$, then $G_t = G^E = 0$,
and if $d_t = P$, then $G_t = \hat{G}^D$ given by the solution to the following maximization program:

$\max_{\tau, G} \; (1 - \tau) A^L + G$  (14)

subject to

$G = (1 - n) \tau A^H + [n - \pi(\tau)] \tau A^L - \frac{h}{q_0} \pi(\tau) - K.$

Proof. That any party, when in power and inheriting $I_{t-1} = 0$, will choose $w_t = h/q_0$
follows immediately from Lemma 3 (otherwise, the investment in infrastructure $K$ cannot be
financed and there will be zero production). The fact that party $R$ will choose $G_t = G^E = 0$
follows immediately from the program in (12) after imposing $w_t = h/q_0$. To see that party $P$
will choose $\hat{G}^D$ as in (14), it suffices to go back to the maximization problem (10), with the
additional restriction that $w_t = h/q_0$.    ■

Remark 3  As with the solution to the maximization problem (9), the solution to (14) may
involve $\hat{G}^D = 0$. With the same reasoning as there, when the level of inequality between the
rich and the poor is sufficiently high, the solution to the program (14) will involve $\hat{G}^D > 0$.

The  next  lemma  provides  necessary  conditions  for  the  party  of  the  rich  to  win  an  election 
starting with $I_{t-1} = 0$:

Lemma  8  In  an  MPE,  dt  =  R,  i.e.,  the  rich  will  win  the  election  at  time  t,  if  It-i  =  0, 

(1  -  q0)  -  >  (1  -  rD)  AL  +  GD  +  ^GD,  (15) 

<?0  P 

and 

Xt>n-\,  (16) 

where  GD  is  given  by  (8),  GD  is  given  by  (14),  and  td  is  given  by  (9). 

Proof.  Lemma  6  establishes  that  It-i  =  0  is  necessary.  Now  suppose  that  It-i  =  0  and 
consider  the  scenario  in  which  party  R  chooses  It  —  0  and  Xt  >  Xt-\  (so  that  no  current 
bureaucrat  will  be  fired).  Consider  the  case  in  which  individual  j  €  Xt  is  pivotal  and  chooses 
v\  =  R  in  all  future  periods.  Then,  his  net  per-period  payoff  will  be  Wt  —  h  =  (1  —  qo)  h/qo, 
and  give  him  a  lifetime  utility  of 

Vi  -  -i-il^^.  (17) 

1-p       go 

In  contrast,  if  j  &  Xt  were  to  choose  v\  =  P  when  pivotal,  his  value  would  be 

Vi  =  ±-h+GD+pvi+1.  (18) 
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where  Vt3+l  is  the  continuation  value  when  party  P  is  in  power  from  then  on,  given  by 

This  last  expression  incorporates  the  fact  that  if  the  poor  are  in  power,  they  reform  the 
bureaucracy,  setting  It  =  1,  and  that  /  —  1  is  an  absorbing  state. 

The comparison of (17) and (18) gives (15), as a weak inequality, as a necessary condition
for bureaucrats to support party R when they are pivotal. Condition (16) is also necessary
since, if it were violated, bureaucrats would not be pivotal and party R would receive less than
half of the votes even with all bureaucrats voting $v_t^j = R$. This argument establishes that
both (15) and (16) are necessary. Moreover, (15), as a strict inequality, and (16) are also
sufficient to ensure $d_t = R$, since when both of these conditions hold, it is a weakly dominant
strategy for bureaucrats to vote for party R whenever $I_{t-1} = 0$ and the coalition of bureaucrats
and the rich has a majority. ■
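The comparison behind (15) can be spelled out in three lines. This is a sketch under one stated assumption: the continuation value under party P is taken to be the bureaucrat's per-period payoff in the efficient state, $(1 - \tau^D) A^L + G^D$, discounted over an infinite horizon. The shorthand $r$ and $u^P$ is introduced here only for readability.

```latex
% Voting R forever, as in (17), versus voting P once when pivotal, as in (18).
% Shorthand (not in the original): r = (1 - q_0) h / q_0,  u^P = (1 - \tau^D) A^L + G^D.
\begin{align*}
\frac{r}{1-\beta} &\ge r + \hat{G}^D + \frac{\beta}{1-\beta}\, u^P \\
\frac{\beta}{1-\beta}\, r &\ge \hat{G}^D + \frac{\beta}{1-\beta}\, u^P \\
r &\ge u^P + \frac{1-\beta}{\beta}\, \hat{G}^D
\end{align*}
```

The last line is condition (15) written as a weak inequality: the per-period rent must cover the payoff under party P plus the annuitized value of the one-time public good received by deviating.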

Lemma 8 determines the conditions under which the bureaucrats will support party R
(a rich agent running for office) and will be numerous enough to give them the majority.
Condition (16) requires the size of the bureaucracy to be sufficient to give the majority to
party R when all bureaucrats vote with the rich. Nevertheless, $n - 1/2$ may not be the actual
size of the bureaucracy. In particular, at $X = n - 1/2$, the government budget may not balance.
To ensure that it does, we need to consider two cases separately.

Let us first define $\tau^E$ as the tax rate that party R would choose as its unconstrained optimal
policy to finance the investment in infrastructure, $K$, given that bureaucratic wages are equal
to $w = h/q_0$. Clearly, $\tau^E$ is given by the unique solution to the equation

$$\pi(\tau^E) \frac{h}{q_0} - (1 - n) \tau^E A^H - [n - \pi(\tau^E)] \tau^E A^L + K = 0. \tag{19}$$

In other words, $\tau^E$ balances the government budget when the minimum number of bureaucrats
necessary to avoid tax evasion, $X = \pi(\tau^E)$, are employed.

The first case corresponds to the one where $\pi(\tau^E) \ge n - 1/2$, so that the unconstrained
optimal size of bureaucracy for party R is also sufficient to make sure that condition (16) is
satisfied and the rich have a majority.

The second case applies when this inequality does not hold, i.e., when $\pi(\tau^E) < n - 1/2$.
In this case, the unconstrained optimal policy for the rich would not satisfy (16), and party
R cannot win the election with the minimum number of bureaucrats. Instead, party R can
win an election only if $X \ge n - 1/2$, and with this larger size of bureaucracy, budget balance
requires the greater tax rate $\hat{\tau}^E$ given by the solution to

$$\left(n - \frac{1}{2}\right) \frac{h}{q_0} - (1 - n) \hat{\tau}^E A^H - \frac{1}{2} \hat{\tau}^E A^L + K = 0. \tag{20}$$

It can be verified that whenever $n - 1/2 > \pi(\tau^E)$, we also have $\hat{\tau}^E > \tau^E$, and whenever
$n - 1/2 < \pi(\tau^E)$, $\hat{\tau}^E < \tau^E$. This implies that the size of the bureaucracy necessary for the
rich to form a winning coalition is the maximum of $\pi(\tau^E)$ and $n - 1/2$, and correspondingly,
the tax rate that party R needs to set is $\max\{\tau^E, \hat{\tau}^E\}$.
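As a numerical illustration of the two cases, the sketch below solves (19) by bisection and (20) in closed form, then takes the two maxima. The linear form `pi(tau) = 0.5 * tau` and every parameter value are hypothetical choices made only for this example; none of them comes from the paper.

```python
def pi(tau):
    # Hypothetical: minimum bureaucracy needed to prevent evasion at tax rate tau.
    return 0.5 * tau

# Hypothetical parameter values, chosen only for illustration.
n, A_H, A_L, h, q0, K = 0.8, 10.0, 1.0, 0.2, 0.5, 0.3

def budget_gap(tau):
    # Left-hand side of (19): wage bill plus K minus tax revenue.
    return pi(tau) * h / q0 - (1 - n) * tau * A_H - (n - pi(tau)) * tau * A_L + K

def bisect(f, lo, hi, steps=200):
    # Plain bisection; assumes f(lo) and f(hi) have opposite signs.
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

tau_E = bisect(budget_gap, 0.0, 1.0)

# (20) is linear in the tax rate once X is fixed at n - 1/2.
tau_hat_E = ((n - 0.5) * h / q0 + K) / ((1 - n) * A_H + 0.5 * A_L)

X = max(pi(tau_E), n - 0.5)    # size of bureaucracy needed to win
tau = max(tau_E, tau_hat_E)    # tax rate party R has to set
print(tau_E, tau_hat_E, X, tau)
```

With these made-up numbers the minimum bureaucracy falls short of $n - 1/2$, so the second case applies and the higher tax rate from (20) binds.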

The results so far have provided the necessary conditions for the rich to be able to generate
sufficient votes from the bureaucrats to remain in power. It remains to check whether the rich
prefer to pursue this strategy and commit to an inefficient state in order to maintain political
power in democracy. The following lemma answers this question:

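Lemma 9 Suppose that condition (15) holds. Then the rich prefer to set $I_t = 0$ for all $t$ if the
following condition is satisfied:

$$\begin{aligned}
&\text{either } \tau^E \ge \hat{\tau}^E \text{ and } (\tau^D - \tau^E) A^H > G^D, \\
&\text{or } \tau^E < \hat{\tau}^E \text{ and } (1 - \hat{\tau}^E) A^H > (1 - \beta)(1 - \tau^E) A^H + \beta \left[ (1 - \tau^D) A^H + G^D \right],
\end{aligned} \tag{21}$$

where $G^D$ is given by (8), $\tau^D$ is given by (9), $\tau^E$ is given by (19), and $\hat{\tau}^E$ is given by (20).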

Proof. Suppose that bureaucrats play $v(I = 0 \mid B) = R$ (that is, they will vote for party
R whenever the state is inefficient). Under the rule of party P, the per-period return of the
rich is $(1 - \tau^D) A^H + G^D$. When $\tau^E \ge \hat{\tau}^E$, party R can remain in power by choosing $I = 0$
and obtain the per-period return $(1 - \tau^E) A^H$, which establishes the first part of (21).

For the second part, note that party R can always choose its myopic optimum when in
power. This will give a representative rich agent utility

$$V^R = (1 - \tau^E) A^H + \frac{\beta}{1 - \beta} \left[ (1 - \tau^D) A^H + G^D \right].$$

Here $(1 - \tau^E) A^H$ is current consumption, and $\beta [(1 - \tau^D) A^H + G^D] / (1 - \beta)$ is the
continuation value, which follows from the observation that since, by assumption, $\tau^E < \hat{\tau}^E$, we have
$n - 1/2 > \pi(\tau^E)$ and thus party R will lose the election at the next date. Then Lemma 6
implies that party P will win all elections at all future dates. Alternatively, party R can choose
$X = n - 1/2$ and guarantee to be in power forever, but at the expense of taxing the rich at
the higher rate $\hat{\tau}^E$. This will give a representative rich agent utility

$$\hat{V}^R = \frac{(1 - \hat{\tau}^E) A^H}{1 - \beta}.$$

Comparison of $\hat{V}^R$ with $V^R$ in the previous expression gives the second part of (21). ■
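The equivalence between the lifetime-utility comparison and the second part of (21) can be checked numerically. The sketch below is only an illustration: the function names and all numbers (tax rates, public good level, productivity, discount factors) are hypothetical, not values from the paper.

```python
def lifetime_utilities(tau_E, tau_hat_E, tau_D, G_D, A_H, beta):
    # Myopic optimum today, then party P rules forever.
    v_myopic = (1 - tau_E) * A_H + beta / (1 - beta) * ((1 - tau_D) * A_H + G_D)
    # Pay the higher tax rate and stay in power forever.
    v_stay = (1 - tau_hat_E) * A_H / (1 - beta)
    return v_myopic, v_stay

def second_part_of_21(tau_E, tau_hat_E, tau_D, G_D, A_H, beta):
    # Per-period form of the same comparison, multiplied through by (1 - beta).
    lhs = (1 - tau_hat_E) * A_H
    rhs = (1 - beta) * (1 - tau_E) * A_H + beta * ((1 - tau_D) * A_H + G_D)
    return lhs > rhs

# Hypothetical values, chosen only for illustration.
params = dict(tau_E=0.118, tau_hat_E=0.168, tau_D=0.4, G_D=1.5, A_H=10.0)
for beta in (0.9, 0.1):
    v_myopic, v_stay = lifetime_utilities(beta=beta, **params)
    print(beta, v_stay > v_myopic, second_part_of_21(beta=beta, **params))
```

In this example a patient rich agent prefers to stay in power at the higher tax rate, while an impatient one prefers the myopic optimum; the two ways of writing the comparison agree in both cases.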

Remark 4 If Condition 1 were not satisfied, the conditions in Lemma 9 could never be
satisfied. In particular, when Condition 1 does not hold, we have $G^D = 0$ and $\tau^D = \tau^E$, so
that neither part of condition (21) could hold. This is a direct consequence of the fact that a
significant conflict in policies between the rich and the poor is necessary for the rich to set up
an inefficient system of patronage politics.

Now  putting  all  these  lemmas  together  we  obtain: 

Proposition 3 Consider the political environment with emerging democracy. If conditions
(15) and (21) hold, then there exists a unique MPE. In this equilibrium, the rich elite choose
$I_t = 0$ for all $t \ge 0$, the rich party R always remains in power and the following policies are
implemented:

$$w_t = \frac{h}{q_0}, \quad X_t = \max\{\pi(\tau^E), n - 1/2\}, \quad G_t = G^E = 0, \quad \text{and} \quad \tau_t = \max\{\tau^E, \hat{\tau}^E\},$$

where $\tau^E$ is given by (19) and $\hat{\tau}^E$ is given by (20).

If, on the other hand, one or both of conditions (15) and (21) hold with the reverse inequality,
the unique MPE involves $I_t = 1$ in the initial period, and for all $t \ge 1$, $d_t = P$ and the
unique policy vector is

$$w_t = (1 - \tau^D) A^L + h, \quad X_t = \pi(\tau^D),$$

$$G_t = G^D = (1 - n) \tau^D A^H + [n - \pi(\tau^D)] \tau^D A^L - [(1 - \tau^D) A^L + h] \pi(\tau^D) - K,$$

and $\tau^D$ is given by (9).

Proof.  The  first  part  of  the  proposition  follows  immediately  from  combining  Lemma  8 
and  Lemma  9,  which  provide  the  conditions,  summarized  by  (15)  and  (21),  under  which  the 
party  of  the  rich,  R,  can  convince  the  bureaucrats  to  vote  for  them,  and  this  is  desirable  for 
the  rich  relative  to  living  under  the  rule  of  party  P.  When  (15)  or  (21)  does  not  hold,  then 
party  P  is  in  power  and  the  second  part  of  the  proposition  follows  immediately  from  Lemma 
6  and  Proposition  1.    ■ 

Remark 5 Proposition 3 does not cover the case in which one of conditions (15) and (21)
holds as equality; in this case the MPE is no longer unique. It is straightforward to see that
in such a case, either the rich or the poor party could receive the majority of the votes, or the
rich could be indifferent between maintaining an inefficient and an efficient state. We do not
describe the equilibrium in these cases to avoid repetition and to save space.

Remark 6 It can also be verified that the set of parameter values where $I_t = 0$ emerges as an
equilibrium in Proposition 3 is nonempty. A straightforward way of doing this is to consider
high values of $\beta$ as in the proof of Proposition 5 in the Appendix.

Proposition  3  is  our  first  major  result.  It  establishes  the  possibility  that  the  rich  elite, 
who  are  in  power  temporarily  at  time  t  =  0,  may  choose  an  inefficient  state  organization 
and  a  large  (inefficient)  bureaucracy  as  a  way  of  credibly  committing  to  providing  rents  to 
bureaucrats.  This  enables  them  to  create  a  majority  coalition  consisting  of  themselves  and 
the  bureaucrats,  and  thus  capture  democratic  politics.  This  coalition  implements  policies 
that  support  low  redistribution  and  low  provision  of  public  goods,  but  creates  high  rents  for 
bureaucrats. Perhaps more interestingly, after t = 1, even when the society is democratic, the
inefficient state institutions persist and the rule of the rich continues. This is in spite of the
fact  that  at  any  date  these  inefficient  institutions  can  be  reformed  at  no  cost  and  made  more 
efficient.  The  reasoning  is  related  to  the  formation  of  the  coalition  between  the  rich  and  the 
bureaucrats  in  the  first  place.  The  rich  realize  that  they  will  be  able  to  maintain  power  only 
by  keeping  an  inefficient  state  structure  and  creating  sufficient  rents  for  bureaucrats.  If  these 
rents  disappear,  bureaucrats  will  ally  themselves  with  the  poor,  since  their  net  income  will  be 
the  same  as  the  net  income  of  poor  producers  (recall  Lemmas  5  and  6).  It  is  precisely  the 
presence  of  inefficient  state  institutions  creating  rents  for  the  bureaucrats  that  induces  them  to 
support  the  policies  of  the  rich.  Recognizing  this,  when  in  power  the  rich  choose  to  maintain 
the  inefficient  state  structure.  At  the  next  date,  the  party  representing  the  rich  receives  the 
support  of  the  bureaucrats  and  the  rich;  consequently,  the  rich  remain  in  power  and  the  cycle 
continues.  The  model  therefore  generates  a  political  economy  theory  for  both  the  emergence 
and  the  persistence  of  inefficient  state  institutions.14 

14 The nature of persistence here is different from the persistence of policies arising in Coate and Morris (1999),
Hassler et al. (2003), or Gomes and Jehiel (2005), because the focus is not on persistence of a certain set of
collective decisions within a given institutional framework, but on the persistence of the inefficiency of state
institutions.

It is also noteworthy that even though taxes are lower in the equilibrium with an inefficient
state than they would have been under permanent democracy (recall Proposition 1 and Lemma
9), the size of the bureaucracy can be greater than under permanent democracy. This could be
the case when the rich elite hire more bureaucrats than necessary for preventing tax evasion
in order to create a majority in favor of the persistence of the inefficient state, i.e., in the case
where $X > \pi(\tau^E)$. In particular, note that the bureaucracy will be more numerous under the
control of the elite than in democracy whenever

$$\pi(\tau^D) < n - 1/2.$$

Since in this case equation (21) implies that $\tau^E < \tau^D$, we must also have $\pi(\tau^E) < \pi(\tau^D) <
n - 1/2$ and thus

$$X > \pi(\tau^E).$$

Consequently,  the  rich  not  only  choose  an  inefficient  state  organization,  but  they  also  choose 
overemployment  of  bureaucrats,  in  the  sense  that  bureaucracy  is  now  unnecessarily  large  and 
the  number  of  bureaucrats  is  strictly  greater  than  that  necessary  for  tax  inspection.  The 
capture  of  democratic  politics  by  the  rich  elite  therefore  creates  an  inefficient  state,  with 
poorly  monitored  and  overpaid  bureaucrats,  and  also  leads  to  a  situation  in  which  the  capacity 
of  the  state  is  not  fully  utilized.  These  inefficiencies  imply  that  the  allocation  of  resources  in 
a  captured  democracy  is  worse  than  in  a  nondemocracy  (or  than  in  a  perfectly  functioning 
democracy).  Naturally,  these  inefficiencies  have  a  political  rationale,  which  is  to  increase  the 
number  of  bureaucrats  that  will  vote  for  the  party  aligned  with  the  rich,  so  that  the  rich  can 
maintain  political  power  in  the  future. 

Interestingly, because creating an inefficient bureaucracy is more costly than creating an
efficient one (which is smaller and gives bureaucrats no rents), the citizens are worse off in
a nonconsolidated (emerging) democracy, where they are taxed at rate $\max\{\tau^E, \hat{\tau}^E\}$, than
they would be under a consolidated nondemocracy, where they are only taxed at rate $\tau^N <
\max\{\tau^E, \hat{\tau}^E\}$. Moreover, the rich are also worse off in this equilibrium than they would be in
a permanent nondemocracy, since they are paying higher wages to bureaucrats and possibly
employing an excessive number of them.

4.5     Comparative  Statics 

We next investigate the conditions under which the equilibrium involves the emergence and
persistence of inefficient state institutions. The following proposition establishes that a certain
degree of inequality between the poor and the rich (i.e., a high level of $A^H/A^L$), a sufficiently
high discount factor, $\beta$, and intermediate bureaucratic rents, $(1 - q_0) h/q_0$, are necessary for
the emergence of inefficient state institutions.

Proposition 4 Consider an economy characterized by the parameters $(\beta, n, A^L, A^H, K, h, q_0)$
and the function $p(\cdot)$. Holding all other parameters constant, we have:

1. there exists $a > 1$ such that if $A^H/A^L < a$, then the state is always efficient, i.e., $I_t = 1$;

2. there exist $a' > 1$ and $\bar{\beta} \in (0, 1)$ such that as long as $A^H/A^L > a'$, $\beta < \bar{\beta}$ implies $I_t = 1$;

3. there exist $\underline{\theta} > 0$ and $\bar{\theta}$ such that if $(1 - q_0) h/q_0 \notin (\underline{\theta}, \bar{\theta})$, then $I_t = 1$.

Proof. For the first part simply recall Remark 4; inspection of the maximization problem
(9) immediately shows that as $A^L \to A^H$, Condition 1 will be violated and the conditions in
(21) cannot hold. Then the result follows from Lemma 9 and Proposition 3.

For the second part, recall from Remark 3 that some minimal level of inequality, say
$A^H/A^L > a'$, is necessary for $\hat{G}^D > 0$. Suppose this is the case. From Proposition 3, condition
(15) is necessary for $I_t = 0$. Since $\hat{G}^D > 0$, there exists $\beta_0 \in (0, 1)$ such that $(1 - q_0) h/q_0 =
(1 - \beta_0) \hat{G}^D / \beta_0$. Since the sum of the other terms on the right-hand side of (15) is positive,
this implies that there exists $\bar{\beta} > \beta_0$ such that for all $\beta < \bar{\beta}$, (15) will be violated and thus
$I_t = 1$.

For the third part, note that bureaucratic rents are equal to $h/q_0 - h = (1 - q_0) h/q_0$, which
needs to be greater than or equal to the right-hand side of (15). Let this right-hand side be
denoted by $\underline{\theta}$ (and note that $\underline{\theta} > 0$). If $(1 - q_0) h/q_0 < \underline{\theta}$, then (15) will be violated and $I_t = 1$.
This implies that we need $(1 - q_0) h/q_0 \ge \underline{\theta} > 0$. Next observe from (19) that there exists a
value of $(1 - q_0) h/q_0$, say $\theta_0$, such that $\tau^E = 1$. It is evident that when $\tau^E = 1$, condition
(21) cannot be satisfied, thus $I_t = 1$. This implies that for $I_t = 0$, we need $h/q_0 < \theta_0$ and thus
$(1 - q_0) h/q_0 < \bar{\theta}$. ■
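The role of the discount factor in condition (15) can be illustrated by rearranging the condition for $\beta$. The sketch below is a hypothetical illustration: the function names and the three input values are assumptions made for the example, with `rent` standing for $(1 - q_0) h/q_0$, `poor_payoff` for $(1 - \tau^D) A^L + G^D$, and `G_hat` for the one-period public good obtained by voting for party P.

```python
def condition_15(beta, rent, poor_payoff, G_hat):
    # Strict form of (15): rent must exceed the payoff under P plus the
    # annuitized one-time public good from deviating.
    return rent > poor_payoff + (1 - beta) / beta * G_hat

def beta_threshold(rent, poor_payoff, G_hat):
    # Rearranging (15) for beta (assumes G_hat > 0); returns None when
    # (15) fails for every beta in (0, 1).
    if rent <= poor_payoff:
        return None
    return G_hat / (rent - poor_payoff + G_hat)

# Hypothetical values, chosen only for illustration.
rent, poor_payoff, G_hat = 2.0, 1.5, 0.5
bar_beta = beta_threshold(rent, poor_payoff, G_hat)
print(bar_beta,
      condition_15(0.6, rent, poor_payoff, G_hat),
      condition_15(0.4, rent, poor_payoff, G_hat))
```

Below the threshold the bureaucrats cannot be induced to vote for party R, whatever the size of the bureaucracy; if the rent does not exceed the payoff under party P, no discount factor is high enough.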

The first part of the proposition implies that a certain level of inequality is necessary for
the emergence of an inefficient state. This is intuitive; with limited inequality, democracy
will not be redistributive and it will not be worthwhile for the rich to set up an inefficient
bureaucracy in order to keep the poor away from power. The second part implies that a
high discount factor is also necessary for the emergence of the inefficient state. This follows
because bureaucrats vote for party R as an "investment", that is, to obtain higher returns in
the future. Instead, if they deviate and vote for party P, in the current period they receive
both the same high wages (since $I_t = 0$) and the positive level of public good provided by party
P, $\hat{G}^D > 0$. If their discount factor were very small, it would be impossible for rich agents
to convince bureaucrats to support their party.15 Finally, the third part of the proposition
implies that bureaucratic rents need to take intermediate values. If bureaucratic rents are very
small, bureaucrats would not support the party of the rich. If they are very large, it becomes
prohibitively costly for the rich to control democratic politics.

15 Robinson (2001) and Acemoglu and Robinson (2006b) also obtain the result that higher discount factors
may lead to greater inefficiencies. However, in these models the source of inefficiency is very different. In
particular, inefficient political equilibria arise when pivotal agents (elites or rulers) are sufficiently patient and
thus take inefficient actions in order to secure their future political survival.

While Proposition 4 shows that a certain degree of inequality is necessary for $I_t = 0$, it does
not establish that inequality has a monotonic effect on the likelihood of an inefficient state.
The next proposition establishes this result under somewhat more restrictive assumptions.
In this proposition, by greater inequality we mean a mean-preserving spread of the income
distribution in the economy, i.e., a simultaneous increase in $A^H$ and decrease in $A^L$ such that
mean income, $Y = (1 - n) A^H + n A^L$, remains constant.

Proposition 5 Suppose that $\pi(\tau)$ is log-concave in $\tau$ and $\tau^D$ given by (9) satisfies $\tau^D <
1 - \pi(\tau^D) < 1$. Then there exists $\bar{\beta} \in (0, 1)$ such that for all $\beta > \bar{\beta}$, greater inequality makes
the inefficient state equilibrium, i.e., $I_t = 0$, more likely.

Proof.  See  the  Appendix.    ■ 

Remark 7 The condition that $\pi(\tau)$ is log-concave is not very restrictive. For example, any
$p(x)$ that takes the power function form, i.e., $p(x) = p_0 x^{\alpha}$ for $p_0 > 0$ and $\alpha \in (0, 1)$, satisfies
this condition. The condition that $\tau^D < 1 - \pi(\tau^D) < 1$ is also natural; if this condition were
violated, we would have that the utility of the poor in democracy, $(1 - \tau^D) A^L + G^D$, would be
non-increasing in $A^L$ (see the Appendix).

In  addition  to  generalizing  the  first  part  of  Proposition  4,  this  result  implies  that  taxes  (and 
public  spending)  can  be  higher  in  more  equal  societies,  because  unequal  societies  are  more  likely 
to  create  inefficient  bureaucracies  to  limit  taxation  and  public  spending.  This  result  therefore 
presents  an  alternative  explanation  to  the  often-discussed  negative  cross-sectional  correlation 
between  inequality  and  redistribution  (e.g.  Perotti,  1996,  Benabou,  2000). 

5      Extensions 

In this section, we discuss a number of extensions of our benchmark model. First, we allow
bureaucrats to be fired when they are caught shirking, so that the incentive compatibility
constraint of bureaucrats is forward-looking and takes into account the rents that a bureaucrat
will lose when he gets caught not exerting effort. Second, we allow a richer political
environment where each individual can run for office (form a party) as a citizen-candidate, so that
bureaucrats can also form their own party and compete against the party of the poor and the
rich. Third, we consider the case where the moral hazard problem of bureaucrats arises from
their temptation to accept bribes that might be offered by taxpayers.

5.1 Equilibrium When Bureaucrats Can Be Fired

The  main  result  of  the  previous  section,  Proposition  3,  was  derived  under  the  assumption  that 
bureaucrats  cannot  be  fired  when  they  are  caught  shirking.  This  simplified  the  analysis  by 
enabling  us  to  write  the  incentive  compatibility  constraint  of  bureaucrats  in  the  simple  form 
of  condition  (5).  As  discussed  in  Remark  1  this  was  mainly  for  expositional  reasons.  We  now 
allow  bureaucrats  to  be  fired  when  they  are  caught  shirking.  It  is  clear  that  from  the  viewpoint 
of  discouraging  shirking,  a  contract  which  commits  to  firing  bureaucrats  when  they  are  caught 
shirking  is  optimal.  The  discussion  in  Remark  1  establishes  that,  in  a  stationary  equilibrium, 
the incentive compatibility constraint of bureaucrats, the equivalent of (5), in this case, would
be:

$$w \ge \beta (1 - \tau) A^L + \left(1 - \beta (1 - q_0)\right) \frac{h}{q_0}. \tag{22}$$

Given this condition, all of the results from the previous section apply with appropriate
modifications. In particular we have (proof omitted):

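Lemma 10 Consider the environment where bureaucrats can be fired for shirking. Then in
any MPE, if $d_t = R$ and $I_{t-1} = 0$, we have $w_t = \beta (1 - \hat{\tau}^E) A^L + (1 - \beta (1 - q_0)) h/q_0$ and
$G_t = G^E = 0$, where $\hat{\tau}^E$ is the solution to

$$X_m \left[ \beta (1 - \hat{\tau}^E) A^L + \left(1 - \beta (1 - q_0)\right) \frac{h}{q_0} \right] - (1 - n) \hat{\tau}^E A^H - [n - X_m] \hat{\tau}^E A^L + K = 0, \tag{23}$$

where $X_m = \max\{\pi(\hat{\tau}^E), n - 1/2\}$.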

Moreover, we have the following generalization of Lemma 9 (proof omitted):

Lemma 11 Consider the environment where bureaucrats can be fired for shirking. Then in
any MPE, the rich will win the election at time $t$ only if there is an inefficient state, i.e.,
$I_{t-1} = 0$; if bureaucrats prefer to support the party of the rich, i.e., if

$$\beta (1 - \hat{\tau}^E) A^L + \frac{(1 - \beta)(1 - q_0)}{q_0} h > (1 - \tau^D) A^L + G^D + \frac{1 - \beta}{\beta} \hat{G}^D, \tag{24}$$

and if the rich-bureaucrat coalition has the majority, i.e., if

$$X_t \ge n - \frac{1}{2}, \tag{25}$$

where $\hat{\tau}^E$ is given by (23), $G^D$ is given by (8), $\tau^D$ is given by (9), and $\hat{G}^D$ is given by the
solution to

$$\max_{\tau, G} \; (1 - \tau) A^L + G$$

subject to

$$G = (1 - n) \tau A^H + [n - \pi(\tau)] \tau A^L - \left[ \beta (1 - \hat{\tau}^E) A^L + \left(1 - \beta (1 - q_0)\right) \frac{h}{q_0} \right] \pi(\tau) - K.$$

Furthermore, the rich prefer this equilibrium and choose $I_t = 0$ at time $t$ only if

$$(\tau^D - \hat{\tau}^E) A^H > G^D. \tag{26}$$

These  two  lemmas  give  the  following  analogue  to  Proposition  3: 

Proposition 6 Consider the political environment with emerging democracy and suppose that
bureaucrats can be fired if caught shirking. Then, if conditions (24) and (26) hold, the unique
MPE is one in which the rich elite choose $I_t = 0$ in the initial period and for all $t$ thereafter,
the rich party always remains in power and the following policies are implemented at all dates:
$w_t = \beta (1 - \hat{\tau}^E) A^L + (1 - \beta (1 - q_0)) h/q_0$, $X_t = \max\{\pi(\hat{\tau}^E), n - 1/2\}$, $G_t = G^E = 0$, and
$\tau_t = \hat{\tau}^E$, where $\hat{\tau}^E$ is given by (23).

If one or both of conditions (24) and (26) hold with the reverse inequality, the unique MPE
involves $I_t = 1$ in the initial period, and for all $t \ge 1$, $d_t = P$ and the unique policy vector is
$w_t = (1 - \tau^D) A^L + h$, $X_t = \pi(\tau^D)$, $G_t = G^D = (1 - n) \tau^D A^H + [n - \pi(\tau^D)] \tau^D A^L - K -
[(1 - \tau^D) A^L + h] \pi(\tau^D)$, and $\tau^D$ as given by (9).

Proof.  Combining  Lemma  10  and  Lemma  11  provides  the  conditions,  (24)  and  (26), 
under  which  the  party  of  the  rich,  R,  can  convince  the  bureaucrats  to  vote  for  them,  and  this 
is  desirable  for  the  rich  relative  to  living  under  the  rule  of  party  P.  When  instead  (24)  or  (26), 
or  both,  do  not  hold,  then  party  P  is  in  power  and  the  second  part  of  the  proposition  follows 
immediately  from  Lemma  6  and  Proposition  1.    ■ 

Proposition 6 demonstrates that the main results from Proposition 3 generalize to the
environment where bureaucrats can be fired if caught shirking. One important difference is
worth noting, however. In our main analysis, Proposition 4 showed that a higher discount
factor, $\beta$, makes the emergence of an inefficient state more likely. Instead, when bureaucrats
can be fired, the relationship between the discount factor and the emergence of inefficient states
is more complex. Higher $\beta$ again increases the importance that bureaucrats attach to future
rents, but it also reduces the level of rents, because being fired from bureaucracy becomes more
costly.
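The second effect can be made explicit with a short calculation based on the wage in Lemma 10. As a sketch, define the bureaucrat's rent as his net payoff $w_t - h$ minus the net income $(1 - \hat{\tau}^E) A^L$ of a poor producer facing the same tax rate. Both this definition of the rent and holding the tax rate fixed as $\beta$ varies are assumptions made for the illustration, not statements taken from the analysis above.

```latex
\begin{align*}
w_t - h &= \beta (1 - \hat{\tau}^E) A^L + \bigl(1 - \beta(1 - q_0)\bigr)\frac{h}{q_0} - h \\
        &= \beta (1 - \hat{\tau}^E) A^L + (1 - \beta)\,\frac{(1 - q_0)\,h}{q_0}, \\
w_t - h - (1 - \hat{\tau}^E) A^L &= (1 - \beta)\left[\frac{(1 - q_0)\,h}{q_0} - (1 - \hat{\tau}^E) A^L\right].
\end{align*}
```

For a given tax rate the rent is proportional to $1 - \beta$, so whenever the bracketed term is positive a higher $\beta$ lowers the per-period rent even as it raises the weight bureaucrats place on future rents. The second line is also the left-hand side of (24).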

5.2 Political Equilibrium with Citizen-Candidates

The  previous  analysis  limited  the  political  system  under  democracy  to  a  two-party  competition 
between  P  and  R,  the  two  parties  representing  the  interests  of  the  poor  and  the  rich.  We 
justified  this  by  assuming  that  bureaucrats  are  not  allowed  to  run  for  office.  Even  if  bureaucrats 
are not allowed to run for office, it is possible that a party representing their interest might
form. If such a party forms, bureaucrats may vote for that party, and the coalition between
the  rich  and  the  bureaucrats,  choosing  low  public  good  provision  and  low  taxes,  may  not 
materialize.  We  now  investigate  whether  in  general  we  expect  this  to  be  the  case  or  not  when 
multiple  parties  can  enter  the  political  system. 

We follow Osborne and Slivinski's (1996) and Besley and Coate's (1997) citizen-candidate
model, where each individual agent can run as a candidate and upon election chooses his most-
preferred policy vector. This setup is quite similar to the one we used above, since parties could
not make credible policy promises and the policy vector was chosen after a politician (party)
was elected to office. The problem with citizen-candidate models in general is that when more
than two parties compete, coordination among the citizens regarding which party has a chance
to win the election is important for the outcomes and typically leads to multiple equilibria in the
voting stage. To avoid these problems, we consider the following modification of the standard
citizen-candidate model:

1. Each individual can decide to form a party and run for office, and this has cost $\varepsilon$, which
is taken to be small (in particular, we will consider the case where $\varepsilon \downarrow 0$). Individuals derive no
utility from coming to power, but simply benefit from being in power by implementing policies
that are in line with their interests.

2. Given all parties that are running for office, individuals vote using ballots with
transferable votes, meaning that each individual ranks all parties in strict order of preference. In
particular, the vote of individual $j$ can be represented as $v_t^j = i_1 i_2 i_3$, where $i_1$, $i_2$ and $i_3$ are
distinct elements of $\{R, P, B\}$, e.g., $v_t^j = RBP$. In the first stage, parties are allocated votes
according to the first preferences of the voters. Then as is standard with this type of voting
rule, the party that gets the lowest fraction of votes is eliminated, and its votes are allocated
to the second-ranked choice of the voters who had originally voted for this party. This process
continues until one of the parties has a majority.
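The counting rule in step 2 can be sketched in a few lines. The function below is a minimal illustration, not part of the model: the function name, the representation of ballots as `(ranking, share)` pairs, the alphabetical tie-break, and the population shares in the example are all assumptions made here.

```python
def transferable_vote_winner(ballots):
    """ballots: list of (ranking, share) pairs, e.g. ("RBP", 0.3)."""
    active = {party for ranking, _ in ballots for party in ranking}
    total = sum(share for _, share in ballots)
    while True:
        tally = {party: 0.0 for party in active}
        for ranking, share in ballots:
            # Count each ballot for its highest-ranked party still in the race.
            for party in ranking:
                if party in active:
                    tally[party] += share
                    break
        leader = max(sorted(tally), key=tally.get)
        if tally[leader] > total / 2 or len(active) == 1:
            return leader
        # Eliminate the party with the lowest tally
        # (ties broken alphabetically, an assumption of this sketch).
        loser = min(sorted(tally), key=tally.get)
        active.remove(loser)

# Hypothetical shares: three parties running, then only two.
print(transferable_vote_winner([("RPB", 0.2), ("PRB", 0.45), ("BRP", 0.35)]))
print(transferable_vote_winner([("RP", 0.2), ("PR", 0.45), ("RP", 0.35)]))
```

Because every ballot is a complete ranking, eliminating a party never discards a vote; it only moves it to the voter's next choice, which is what makes truthful ranking attractive under this rule.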

To  simplify  the  discussion,  in  this  section  we  assume  that  bureaucrats  cannot  be  fired  if 
caught  shirking,  so  the  incentive  compatibility  constraint  for  bureaucrats  is  given  by  (5) — 
though  this  has  no  effect  on  any  of  the  results  in  the  section. 

Given  this  setup,  the  notion  of  Markov  Perfect  Equilibrium  is  modified  accordingly.  The 
analysis  in  this  case  is  still  tractable  thanks  to  the  following  series  of  lemmas: 

Lemma  12    Truthful  ranking  is  a  weakly  dominant  strategy  for  each  individual. 

Proof. The transferable votes imply that at any stage of the elimination process, either an
individual is pivotal, has a choice between two options, and thus is better off ranking his more
preferred outcome above a less preferred outcome, or the individual is not pivotal and
any choice is a best response. This establishes that truthful ranking is weakly dominant. ■

Lemma  13  In  any  MPE,  there  will  never  be  more  than  one  party  operated  by  an  individual 
of  the  same  group.   Thus  the  maximum  number  of  parties  is  three. 

Proof. The result follows since the policies chosen by two parties run by two poor agents
(or two rich agents or two bureaucrats) will be identical. Moreover, from Lemma 12, each
agent ranks parties truthfully, thus the addition of a new party will not change the equilibrium
probability that a party run by a poor individual, a rich individual or a bureaucrat wins the
election. Thus conditional on a party run by a poor agent existing, there is no point for any
other poor agent to incur the cost $\varepsilon > 0$ and form a party. ■

Lemma 13 then enables us to simply look at the (truthful) preference ranking of each
individual over the three parties {P, R, B}, corresponding to parties run by a poor individual,
a  rich  individual  and  a  bureaucrat  (there  is  no  source  of  confusion  in  this  notation,  since 
there  can  at  most  be  one  party  run  by  a  poor  agent,  one  run  by  a  rich  agent,  and  one  run 
by  a  bureaucrat).  To  do  this,  we  need  to  know  the  policies  that  will  be  chosen  by  the  three 
types  of  parties.  Our  previous  analysis  already  establishes  the  policies  that  will  be  chosen  by 
parties  P  and  R  (provided  that  party  R  is  trying  to  come  to  office  by  attracting  the  votes 
of  bureaucrats).  We  therefore  only  need  to  look  at  the  policy  choice  of  a  party  run  by  a 
bureaucrat.  The  following  lemma  characterizes  this  choice: 

Lemma 14 Taking future election results as given, the party $d_t = B$ would choose the following
policy vector: $G_t = 0$, $X_t = \min\{X_{t-1}, \pi(\tau^B)\}$, and $(\tau^B, w^B)$ such that

$$(\tau^B, w^B) \in \arg\max_{\tau, w} \; w$$

subject to

$$\min\{X_{t-1}, \pi(\tau)\}\, w + K \le (1 - n) \tau A^H + [n - \min\{X_{t-1}, \pi(\tau)\}] \tau A^L.$$

Proof. This immediately follows by writing the program to maximize the return to a
bureaucrat (without allowing firing of existing bureaucrats):

\[ \max_{\tau, w, X_t, I, G} \; w + G \]
subject to
\[ \min\{X_{t-1}, \pi(\tau)\} \le X_t, \]
\[ \max\left\{ \frac{h}{q(I)},\; (1 - \tau) A^L + h \right\} \le w, \]
\[ G \le (1 - n)\,\tau A^H + [n - X_t]\,\tau A^L - w X_t - K, \]
\[ 0 \le G. \]


Intuitively,  bureaucrats  would  maximize  their  wages  subject  to  the  government  budget 
constraint.  Notice  that  Lemma  14  applies  taking  the  results  of  future  elections  as  given.  If  the 
current  bureaucratic  government  could  influence  the  outcome  of  future  elections,  this  could 
be beneficial for it only by increasing $X_t$ above $\min\{X_{t-1}, \pi(\tau^B)\}$, which would (from the
government  budget  constraint)  make  this  policy  vector  even  less  attractive  to  poor  and  rich 
agents. 
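
To make the wage-maximization logic concrete, the following minimal Python sketch computes the highest wage a bureaucratic government could pay for a given tax rate and size of bureaucracy, by setting $G = 0$ and letting the budget constraint in the proof of Lemma 14 bind. The function name and the numerical values are illustrative assumptions and not part of the model; the choice of $\tau$ itself, which works through $\pi(\tau)$, is not modeled here.

```python
def max_bureaucrat_wage(tau, X, n, A_H, A_L, K):
    """Highest w with w*X + K <= (1-n)*tau*A_H + (n-X)*tau*A_L and G = 0.

    All arguments are hypothetical inputs: tau is the tax rate, X the size of
    the bureaucracy, n the mass of poor agents, A_H and A_L the productivities
    of rich and poor producers, and K the infrastructure cost.
    """
    if X <= 0:
        raise ValueError("the bureaucracy must have positive size")
    # Tax revenue: the rich (mass 1-n) and the poor who still produce (mass n-X).
    revenue = (1 - n) * tau * A_H + (n - X) * tau * A_L
    # Everything left after financing K is paid out as wages.
    return (revenue - K) / X


# Illustrative numbers only.
w_B = max_bureaucrat_wage(tau=0.3, X=0.2, n=0.8, A_H=3.0, A_L=1.0, K=0.05)
print(round(w_B, 4))
```

Because every unit of revenue beyond $K$ goes to wages, any increase in $X_t$ at a given tax rate lowers the wage each bureaucrat can be paid, which is the budget effect the text refers to.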

The  key  to  the  results  in  this  section  is  the  following  observation:  because  a  bureaucratic 
government  will  maximize  wages  paid  to  bureaucrats  (and  provide  no  public  goods),  it  yields 
a  lower  utility  to  poor  agents  than  a  rich  government  would  do.  As  a  result,  we  will  see  that  a 
bureaucratic  government  will  never  get  elected.  To  show  this  more  formally,  let  us  denote  the 
vote of individual $j$ at time $t$ by $v_t^j$, which is a ranking over $\{P, R, B\}$. For example, $v_t^j = PRB$
means that the individual ranks the poor party first, the rich party second, and the bureaucratic
party last.

We  now  have  the  following  rankings  for  individuals: 

Lemma 15 If $j \in \mathcal{H}$, then $v_t^j = RPB$.
If $j \in \mathcal{L}$ and $j \notin X$, then $v_t^j = PRB$.
If $j \in X$, then $v_t^j = BRP$.

Proof. We have already established that voters rank parties truthfully, so that all voters
rank their own party first. Assuming that party R implements the policy characterized in
Proposition 3 to attract the bureaucrats, we have that bureaucrats indeed prefer the rich to
the poor as second choice; hence, if $j \in X$, then $v_t^j = BRP$. Moreover, the poor prefer the rich
to the bureaucrats since neither of them offers any public good, but the rich tax less than the
bureaucrats. This follows since both the rich and the bureaucrats choose to finance K, and
party B chooses a wage $w^B$ for bureaucrats higher than the wage $h/q_0$ that the bureaucrats
get if the rich are in power. Hence, if $j \in \mathcal{L}$ and $j \notin X$, then $v_t^j = PRB$. Finally, the second
choice of the rich is the poor, both because the poor would provide a positive amount of
the public good rather than zero as the bureaucrats would, and because the poor would tax
less than the bureaucrats given that the marginal cost of taxation for the poor is positive, and
zero for the bureaucrats (who are not taxed by assumption). It follows that if $j \in \mathcal{H}$, then
$v_t^j = RPB$. ■

Lemma 15 implies that the poor, when they cannot have a majority by themselves, will
support the rich party; thus, as long as the bureaucrats are not in majority by themselves, i.e.,
$X_t < 1/2$, and the rich pursue the policy in Proposition 3, we will have $d_t = R$. This implies
that the rich can continue to use the same political strategies as in the previous section to control
political decision-making in democracy.
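
The rankings in Lemma 15 can be turned into a small computation. The sketch below is a hedged illustration in Python: it assumes that the election is decided by pairwise majority comparisons (an assumption of this example; the voting rule itself is specified earlier in the paper and is not reproduced here), and it takes as given that party R offers the policy of Proposition 3, so that bureaucrats rank R above P. The function name and the group sizes are hypothetical.

```python
def pairwise_winner(n, X):
    """Return the party that beats both rivals in pairwise majority votes, or None.

    Group sizes follow the text: rich 1 - n, poor producers n - X, bureaucrats X.
    Rankings are those of Lemma 15. The pairwise-majority rule is an assumption
    of this sketch, used only to illustrate who can assemble a majority.
    """
    groups = {
        "rich": (1 - n, "RPB"),
        "poor": (n - X, "PRB"),
        "bureaucrats": (X, "BRP"),
    }

    def beats(a, b):
        # Total mass of voters who rank party a above party b.
        support = sum(size for size, rank in groups.values()
                      if rank.index(a) < rank.index(b))
        return support > 0.5

    for party in "PRB":
        if all(beats(party, other) for other in "PRB" if other != party):
            return party
    return None


print(pairwise_winner(n=0.7, X=0.25))  # rich plus bureaucrats outnumber the poor
print(pairwise_winner(n=0.7, X=0.10))  # poor producers are a majority on their own
```

In the first call neither the poor producers nor the bureaucrats are a majority by themselves, and party R is the only party that wins both of its pairwise contests; in the second call the poor producers exceed one half and party P prevails.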

Now  combining  the  previous  lemmas,  we  have  the  following  proposition,  which  mirrors 
Proposition  3. 

Proposition 7 Consider the political environment with emerging democracy and free political
entry by citizen candidates. Suppose that $\varepsilon \downarrow 0$, that conditions (15) and (21) hold, and that
$\pi(\tau^E) < 1/2$, where $\tau^E$ is defined by (19) above. Then, in any MPE of the citizen-candidate political
game, only a party run by a rich agent is active. The unique equilibrium policy vector is given
by $I_t = 0$, $w_t = h/q_0$, $X_t = \max\{\pi(\tau^E), n - 1/2\}$, $G_t = G^E = 0$, and $\tau_t = \max\{\tau^E, \hat{\tau}^E\}$,
for all $t$, where $\hat{\tau}^E$ is given by (20).

If one or both of conditions (15) and (21) hold with the reverse inequality and $\pi(\tau^D) < 1/2$,
where $\tau^D$ is defined by (9), then in the unique MPE only a party run by a poor
agent is active, and the unique equilibrium policy vector involves $I_t = 1$ for all $t$, and for all
$t > 1$, $w_t = (1 - \tau^D) A^L + h$, $X_t = \pi(\tau^D)$, and

\[ G_t = G^D = (1 - n)\,\tau^D A^H + [n - \pi(\tau^D)]\,\tau^D A^L - [(1 - \tau^D) A^L + h]\,\pi(\tau^D) - K. \]

Proof. This proposition can be proved by backward induction. First, suppose that conditions
(15) and (21) hold. Then, from Lemma 15, when $X < 1/2$, the bureaucratic party
will never win an election. The assumption that $X_{-1} = 0$ implies that in the initial period
$X_{-1} < 1/2$, and the assumption that $\pi(\tau^E) < 1/2$ ensures that $X < 1/2$ continues to be the
case when the rich party is in power. Therefore, when the rich party is in power, no bureaucrat
incurs the cost $\varepsilon \downarrow 0$ to form a party, and thus bureaucrats support party R by the same
argument as in the proof of Proposition 3. Next, knowing that bureaucrats support party R,
no poor agent incurs the cost $\varepsilon \downarrow 0$ to compete against party R as long as party R is choosing
the policy in Proposition 3 (if it did deviate from this policy, then a poor party could win an
election, and thus a poor agent would find it beneficial to enter and form a party since $\varepsilon \downarrow 0$;
thus despite the fact that party P would not be running, party R has to adopt the same policy
vector as in Proposition 3). Finally, since $\varepsilon \downarrow 0$, it cannot be an equilibrium for no rich agent
to form a party, since such a party would create strictly positive gains for each rich agent, and
the cost of creating a party is $\varepsilon \downarrow 0$.

The  proof  of  the  cases  where  one  or  both  of  conditions  (15)  and  (21)  hold  with  the  reverse 
inequality  is  similar.    ■ 

This proposition therefore shows that our main results regarding the use of an inefficient
state as a way for the rich elite to control the democratic political process continue to apply
even when the political structure is enriched to allow free entry by citizen-candidates of any
occupation. The additional insight in this case is that the poor
producers prefer to support the party of the rich, R, rather than the party of the bureaucrats,
B, since the latter would impose high taxes and provide no public goods (spending all the
proceeds on bureaucratic wages).

5.3     Bureaucratic  Corruption 

We  now  briefly  discuss  an  extension  of  our  basic  model  in  which  the  moral  hazard  problem  on 
the  side  of  bureaucrats  is  not  related  to  their  effort,  but  to  whether  or  not  they  accept  bribes 
from producers evading taxes. This source of moral hazard is arguably as important
as  the  effort  choice  of  bureaucrats.  Moreover,  we  will  see  below  that  it  leads  to  an  interesting 
pattern  of  de  facto  regressive  taxation  as  a  result  of  successful  patronage  politics  by  the  rich 
elite. 

The  economic  and  political  environment  is  similar  to  the  baseline  version  of  the  model  with 
a  two-party  system.  The  only  difference  is  that  the  bureaucrats  no  longer  have  an  effort  choice. 
Instead,  producers  that  have  evaded  taxes  can  pay  a  bribe  b  >  0  to  the  bureaucrat  inspecting 
them  in  order  to  avoid  paying  taxes. 

Similar to the baseline model, we allow for two levels of monitoring efficiency, described by
the state variable $I \in \{0, 1\}$. When $I = 1$, there is an efficient organization of the state and
corruption is detected with probability $q(I = 1) = 1$. When $I = 0$, the state organization is
inefficient and corruption is detected with probability $q(I = 0) = q_0 < 1$. We make a number of
assumptions to simplify the exposition. First, we assume that a bureaucrat caught accepting
bribes loses his wage and the bribe, but the punishment is limited to only one period; the
producer paying the bribes loses the bribe but receives no other punishment. Second, all bribe
payments and other income confiscated are lost and thus do not enter the government budget
constraint. Third, we assume that after matching with a bureaucrat, the producer has all the
bargaining power and makes a take-it-or-leave-it bribe offer to the bureaucrat. All of these
assumptions can be relaxed without changing our main results.

Finally, we assume that each bureaucrat can be matched with at most one producer and
that, for the relevant part of the parameter values, $p(x) < x/(1 - x)$. Note that the function
$p(x)$ is concave while $x/(1 - x)$ is convex and both are equal to zero for $x = 0$. Therefore,
there is a range $x \in [0, x_m]$ such that $p(x) > x/(1 - x)$. We assume that $x_m$ is lower than
the minimum size of the bureaucracy necessary to finance the infrastructure K, which ensures
that the region where $p(x) > x/(1 - x)$ is irrelevant for the equilibrium.

Let us start with the case where the state is inefficient so that $q = q_0$ and characterize the
most preferred policies of the rich. The participation constraint of the bureaucrat is slightly
different from (6), since there is no cost of effort. It requires that

\[ w_t \ge (1 - \tau_t) A^L. \tag{27} \]

The incentive compatibility constraint for bureaucrats (5) is now replaced by the following
"no bribe constraint":

\[ w_t \ge (1 - q_0)(w_t + b_t), \tag{28} \]

where $b_t$ is the bribe offered to the bureaucrat by a producer. Intuitively, the right-hand
side of (28) represents the expected return of a bureaucrat that accepts a bribe $b_t$, given by
the sum of the wage and the bribe, weighted by the probability of not being detected. If
condition (28) does not hold, it is not possible to prevent the corruption of bureaucrats by
producers.16 Condition (28) implies that, given the public sector wage $w_t$, only bribes higher
than a threshold $b(w_t)$ will be accepted, where

\[ b(w) = \frac{q_0}{1 - q_0}\, w. \tag{29} \]
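
For completeness, the threshold in (29) can be recovered from the no bribe constraint (28) in one line. The rearrangement below uses only symbols already defined and relies on $q_0 < 1$, so that dividing by $1 - q_0$ preserves the inequality; time subscripts are dropped.

```latex
\begin{align*}
w \ge (1 - q_0)(w + b)
  &\iff q_0\, w \ge (1 - q_0)\, b \\
  &\iff b \le \frac{q_0}{1 - q_0}\, w = b(w).
\end{align*}
```

A bribe strictly above $b(w)$ therefore violates (28) and is accepted, while at $b = b(w)$ the bureaucrat is exactly indifferent; the text treats this threshold as the lowest acceptable bribe.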

In what follows, we drop time subscripts to simplify notation. When in power, the rich
maximize their per-period utility with respect to $\tau$, $w$, $x$, $G$, and the decision variable
$z \in \{0, 1\}$, which, as before, designates their decision of whether to pay taxes. The expected utility
of the rich when they do not pay taxes is

\[ u^H(z = 0) = p(x) \max\left\{ A^H - \frac{q_0}{1 - q_0}\, w,\; 0 \right\} + [1 - p(x)]\, A^H + G. \tag{30} \]


16 Van Rijckeghem and Weder (2001) provide evidence that higher public sector wages relative to
manufacturing wages reduce the scope for the corruption of the public administration.

Expression (30) incorporates the following facts: (i) producers are inspected by a bureaucrat
with probability $p(x)$; (ii) bribing is detected with probability $q_0$; (iii) the bribe offered by the
rich to bureaucrats is equal to the lowest acceptable bribe $b(w) = q_0 w/(1 - q_0)$ defined in
(29); and (iv) when inspected, the income of a rich producer is $\max\{A^H - q_0 w/(1 - q_0), 0\}$.
Expression (30) is maximized subject to the following constraints

\[ p(x) \max\left\{ A^L - \frac{q_0}{1 - q_0}\, w,\; 0 \right\} + [1 - p(x)]\, A^L \le (1 - \tau) A^L, \tag{31} \]

\[ x w + G + K \le (n - x)\,\tau A^L, \tag{32} \]

and  subject  to  the  participation  constraint  (27)  of  the  bureaucrats.  Constraint  (31)  requires 
that  the  poor  prefer  to  pay  taxes  to  tax  evasion.  This  constraint  has  to  be  satisfied  since  at 
least  one  class  must  pay  taxes,  otherwise  it  would  not  be  possible  to  finance  the  infrastructure 
investment,  K  (this  is  because,  if  the  poor  prefer  to  evade  taxes,  the  rich  will  do  so  a  fortiori). 
Constraint  (32)  implies  that  the  government  budget  constraint  is  satisfied,  taking  account  of 
the  fact  that  public  revenues  come  from  the  taxation  of  the  poor  only. 

Lemma 16 Suppose that the rich prefer not to pay taxes. Then their optimal policies involve

\[ w^E = \frac{(1 - q_0) A^L}{q_0}, \tag{33} \]

\[ p(x) = \tau, \tag{34} \]

and $G^E = 0$ for some $\tau \in [0, 1]$.

Proof.  See  the  Appendix.    ■ 

We will next show that given (33) and (34), the equilibrium involves tax evasion by the
rich.  Substituting  for  these  expressions,  we  obtain  the  utility  of  the  rich  when  they  evade  taxes 
as 

\[ u^H(z = 0) = (1 - \tau) A^H + \tau (A^H - A^L). \tag{35} \]

In contrast, when a rich agent pays the tax rate (while all others evade taxes) his utility
would be

\begin{align}
u^H(z = 1) &= (1 - \tau) A^H \tag{36} \\
           &< (1 - \tau) A^H + \tau (A^H - A^L) \notag \\
           &= u^H(z = 0). \notag
\end{align}
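
As a numerical illustration of (33)-(36), the following Python sketch evaluates the payoff of a rich agent under tax evasion and under tax compliance when the wage is set as in (33) and the inspection probability as in (34). It is a minimal sketch under stated assumptions: the function names and parameter values are hypothetical, and $G = 0$ as in Lemma 16.

```python
def lowest_acceptable_bribe(w, q0):
    """Threshold bribe b(w) = q0 * w / (1 - q0) from (29)."""
    return q0 * w / (1 - q0)


def rich_payoffs(tau, q0, A_H, A_L):
    """Per-period payoffs of a rich agent under the policies of Lemma 16 (G = 0).

    Returns (payoff when evading, payoff when paying the tax). The inspection
    probability is set to p(x) = tau, as in (34).
    """
    w_E = (1 - q0) * A_L / q0                  # wage in (33)
    bribe = lowest_acceptable_bribe(w_E, q0)   # equals A_L at this wage
    p = tau                                    # condition (34)
    evade = p * max(A_H - bribe, 0.0) + (1 - p) * A_H   # expression (30), G = 0
    pay = (1 - tau) * A_H                      # expression (36)
    return evade, pay


# Hypothetical parameter values, chosen only for illustration.
evade, pay = rich_payoffs(tau=0.2, q0=0.25, A_H=4.0, A_L=2.0)
print(evade, pay)
```

At the wage in (33) the lowest acceptable bribe equals $A^L$, so the gap between the two payoffs is $\tau (A^H - A^L)$, which is the extra term in (35).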

Next let $\hat{\tau}^E$ denote the unique value of $\tau$ satisfying the government budget constraint (32),
at the candidate equilibrium with the rich agents evading taxes:

\[ \pi(\hat{\tau}^E)\, \frac{1 - q_0}{q_0}\, A^L + K = [n - \pi(\hat{\tau}^E)]\, \hat{\tau}^E A^L, \tag{37} \]

where $\pi(\cdot)$ is again defined in (1).

As in the main analysis, there are two cases to consider depending on whether $n - \pi(\hat{\tau}^E)$
is greater than or less than 1/2. Here we simplify the analysis by focusing on the case where
there are sufficiently many bureaucrats so that, together with the rich, they are the absolute
majority, i.e., $n - \pi(\hat{\tau}^E) < 1/2$. The converse case with $n - \pi(\hat{\tau}^E) > 1/2$ necessitates that
the rich create an inefficiently large bureaucracy in order to win the election. Since the results
in this case are again similar, we do not discuss them in this extension.

Lemmas 5 and 6 continue to apply in this modified environment. In particular, if bureaucrats
ever  vote  for  the  poor,  there  is  a  permanent  transition  to  an  equilibrium  with  an  efficient  state 
with  the  poor  in  power  within  one  period  from  the  election.  The  following  lemma  characterizes 
the  policy  vector  that  the  poor  would  implement  in  the  period  they  win  the  election  when  the 
existing  organization  of  the  state  is  inefficient  and  also  the  policy  vector  that  they  will  choose 
when  the  state  is  efficient. 

Lemma 17 Suppose that $d_t = P$ and consider the following maximization program:

\[ \max_{\tau, G, w} \; (1 - \tau) A^L + G \]
subject to
\[ G = z (1 - n)\,\tau A^H + [n - \pi(\tau)]\,\tau A^L - w\,\pi(\tau) - K, \]
\[ p(\pi(\tau)) \max\left\{ A^L - \frac{q(I)}{1 - q(I)}\, w,\; 0 \right\} + [1 - p(\pi(\tau))]\, A^L \le (1 - \tau) A^L, \]
\[ p(\pi(\tau)) \max\left\{ A^H - \frac{q(I)}{1 - q(I)}\, w,\; 0 \right\} + [1 - p(\pi(\tau))]\, A^H \le (1 - \tau) A^H \text{ and } z = 1, \text{ or } z = 0, \]
and $(1 - \tau) A^L \le w$, where $z \in \{0, 1\}$ denotes the decision of the rich whether to pay taxes.
Then the policy vector that the poor would choose when $I_{t-1} = 0$, $\tau_t = \tilde{\tau}^D$, $G_t = \tilde{G}^D$, $w_t = \tilde{w}^D$,
is given by the solution to this program when $q(I) = q_0$. The policy vector that the poor would
choose when $I_{t-1} = 1$ is given by $\tau_t = \hat{\tau}^D$, $G_t = \hat{G}^D$, $w_t = \hat{w}^D$, when $q(I) = 1$ and the first
term in the last two inequalities is equal to zero.

The  penultimate  inequality  in  the  maximization  program  in  this  lemma  represents  the  "no 
tax  evasion"  constraint  for  the  poor,  while  the  last  constraint  allows  the  program  to  choose 
whether  or  not  to  satisfy  the  no  tax  evasion  constraint  of  the  rich.  Notice  that  if  this  last 
constraint  is  satisfied,  the  penultimate  one  will  also  be  satisfied  automatically  (since  AH  >  AL). 
When $I_{t-1} = 1$ and the state is efficient, bribery is not possible and the max term in the last
two  inequalities  becomes  zero. 
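
The claim that the no tax evasion constraint of the rich implies that of the poor can be checked case by case. The sketch below uses the shorthand $p \equiv p(\pi(\tau))$ for the inspection probability and $b \equiv q(I)\, w/(1 - q(I))$ for the lowest acceptable bribe; this shorthand is ours and is introduced only for this check, which concerns the inefficient state ($q(I) = q_0$).

```latex
\begin{align*}
&\text{Rich, } b \le A^H: && A^H - p\,b \le (1 - \tau) A^H \iff p\,b \ge \tau A^H, \\
&\text{Poor, } b \le A^L: && A^L - p\,b \le (1 - \tau) A^L \iff p\,b \ge \tau A^L, \\
&\text{Poor, } b > A^L:   && (1 - p) A^L \le (1 - \tau) A^L \iff p \ge \tau.
\end{align*}
```

If $b \le A^L$, the constraint of the rich gives $p\,b \ge \tau A^H \ge \tau A^L$, which is the constraint of the poor. If $A^L < b \le A^H$, it gives $p \ge \tau A^H / b \ge \tau$. If $b > A^H$, both constraints reduce to $p \ge \tau$. In each case the constraint of the poor follows from $A^H > A^L$.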

The next two lemmas are the analogues of Lemmas 8 and 9 and determine the conditions
under  which  the  bureaucrats  are  willing  to  vote  for  the  rich,  and  the  rich  prefer  the  allocation 
in  which  they  are  in  power  to  the  one  in  which  the  poor  are  in  power.  Since  their  proofs  are 
similar  to  those  of  Lemmas  8  and  9,  they  are  omitted. 

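Lemma 18 In an MPE, the rich will win the election at time $t$ (i.e., $d_t = R$) if and only if
$I_{t-1} = 0$ and

\[ \frac{1 - q_0}{q_0}\, A^L > (1 - \beta)\,(\tilde{w}^D + \tilde{G}^D) + \beta \left[ (1 - \hat{\tau}^D) A^L + \hat{G}^D \right], \tag{38} \]

where $\tilde{w}^D$, $\tilde{G}^D$, $\hat{\tau}^D$ and $\hat{G}^D$ are defined in Lemma 17.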

Condition (38) implies that the bureaucrats prefer to be in an inefficient state under the
rule of the rich, given the "wage policy" that is optimal for the rich, rather than voting for the
poor. In fact, if they vote for the rich, the bureaucrats obtain a wage equal to $(1 - q_0) A^L/q_0$,
whereas if they vote for the poor, they obtain a wage of $\tilde{w}^D$ and public good provision of $\tilde{G}^D$
for one period (while the state is inefficient), and subsequently a payoff equal to the payoff of
the poor under an efficient state.
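
Condition (38) is straightforward to evaluate once the objects from Lemma 17 are known. The Python sketch below is a hedged illustration: it takes those objects as given inputs rather than solving the program in Lemma 17, and the function name, argument names and numbers are all hypothetical.

```python
def bureaucrats_support_rich(q0, A_L, beta, w_trans, G_trans, tau_eff, G_eff):
    """Check condition (38).

    w_trans, G_trans: wage and public good under the poor during the one period
    in which the state is still inefficient. tau_eff, G_eff: tax rate and public
    good once the state is efficient. These would come from Lemma 17; here they
    are simply inputs.
    """
    payoff_with_rich = (1 - q0) * A_L / q0
    payoff_with_poor = ((1 - beta) * (w_trans + G_trans)
                        + beta * ((1 - tau_eff) * A_L + G_eff))
    return payoff_with_rich > payoff_with_poor


# Illustrative numbers only: weak monitoring (low q0) makes the rent large.
print(bureaucrats_support_rich(q0=0.25, A_L=1.0, beta=0.9,
                               w_trans=1.0, G_trans=0.5, tau_eff=0.3, G_eff=0.4))
# Strong monitoring (high q0) shrinks the rent and reverses the comparison.
print(bureaucrats_support_rich(q0=0.80, A_L=1.0, beta=0.9,
                               w_trans=1.0, G_trans=0.5, tau_eff=0.3, G_eff=0.4))
```

The comparison shows the role of $q_0$: the less effective monitoring is, the larger the wage the rich must pay to deter bribery, and the more attractive the inefficient state is to bureaucrats.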

Lemma 19 Suppose that condition (38) holds. Then, the rich prefer to set $I_t = 0$ for all $t$ if
the following condition is satisfied

\[ (1 - \hat{\tau}^D) A^H + \hat{G}^D < [1 - p(\pi(\hat{\tau}^E))]\, A^H + p(\pi(\hat{\tau}^E))\,(A^H - A^L), \tag{39} \]

where $\hat{\tau}^D$ and $\hat{G}^D$ are defined in Lemma 17, and $\hat{\tau}^E$ is given by (37).

Proof. It is immediate that (39) is sufficient to ensure that the rich prefer to be in power
with an inefficient state, set the tax rate $\hat{\tau}^E$, evade taxes and pay bribes equal to $A^L$ with
probability $p(\pi(\hat{\tau}^E))$ to living under democracy with taxes and public good provision given
by $\hat{\tau}^D$ and $\hat{G}^D$ as in Lemma 17. ■

Condition (39) states that the payoff to the rich when the state is efficient (and the poor
are in power) is lower than the expected payoff that they get when the state is inefficient (and
they are in power). The latter payoff reflects the following facts: only the poor pay taxes, tax
payers are inspected with probability $p(\pi(\hat{\tau}^E)) = \hat{\tau}^E$, and the rich offer a bribe equal to $A^L$
to the bureaucrat inspecting them.
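
The comparison in condition (39) can be sketched in the same way. The Python function below takes the equilibrium tax rate under the inefficient state and the democratic tax rate and public good level as given inputs; the names and numbers are hypothetical and serve only to illustrate the inequality.

```python
def rich_prefer_inefficient_state(tau_E, tau_D, G_D, A_H, A_L):
    """Check condition (39), using p(pi(tau_E)) = tau_E.

    tau_E: tax rate under the inefficient state (from (37)).
    tau_D, G_D: tax rate and public good under democracy with an efficient state.
    """
    payoff_democracy = (1 - tau_D) * A_H + G_D
    p = tau_E  # inspection probability at the equilibrium size of bureaucracy
    payoff_capture = (1 - p) * A_H + p * (A_H - A_L)
    return payoff_democracy < payoff_capture


# Illustrative numbers only.
print(rich_prefer_inefficient_state(tau_E=0.2, tau_D=0.4, G_D=0.5, A_H=4.0, A_L=1.0))
print(rich_prefer_inefficient_state(tau_E=0.2, tau_D=0.4, G_D=2.0, A_H=4.0, A_L=1.0))
```

The right-hand side simplifies to $A^H - \hat{\tau}^E A^L$, so the rich give up only the expected bribe; the condition fails when the public good they would receive under democracy is large enough to compensate for the higher tax.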

The following proposition characterizes the equilibrium with bureaucratic corruption. Since
its proof follows that of Proposition 3 closely, it is omitted.

Proposition 8 Consider the political environment with emerging democracy. Then, if conditions
(38) and (39) hold, the unique MPE is one in which the rich elite choose $I_t = 0$ in
the initial period and for all $t$ thereafter, the rich party R always remains in power and the
following policies are implemented at all dates:

\[ w_t = w^E = \frac{1 - q_0}{q_0}\, A^L, \quad X_t = \pi(\hat{\tau}^E), \]

\[ G_t = G^E = 0, \quad \text{and} \quad \tau_t = \hat{\tau}^E, \]

where $\hat{\tau}^E$ is given by (37). Moreover, only the poor pay taxes, while the rich evade taxes and
pay a bribe equal to $b = A^L$ when inspected.

If, on the other hand, one or both of conditions (38) and (39) hold with the reverse inequality,
the unique MPE involves $I_t = 1$ for all $t$, and for all $t > 1$, $d_t = P$ and the unique policy
vector is

\[ w_t = \hat{w}^D = (1 - \hat{\tau}^D) A^L, \quad X_t = \pi(\hat{\tau}^D), \]

\[ G_t = \hat{G}^D = (1 - n)\,\hat{\tau}^D A^H + [n - \pi(\hat{\tau}^D)]\,\hat{\tau}^D A^L - (1 - \hat{\tau}^D) A^L\, \pi(\hat{\tau}^D) - K, \]

where $\hat{\tau}^D$ and $\hat{G}^D$ are defined in Lemma 17.

Proof.  The  first  part  of  the  proposition  follows  from  Lemmas  16-19.  The  only  part  that 
remains  to  be  proved  is  that  when  one  or  both  of  conditions  (38)  and  (39)  hold  with  the  reverse 
inequality,  the  poor  will  be  in  power.  To  see  this,  note  that  these  conditions  (with  the  reverse 
inequality)  are  sufficient  for  the  rich  to  prefer  democracy  to  setting  up  an  inefficient  state  and 
evading  taxes.  Moreover,  as  before,  if  the  state  is  efficient  (i.e.,  It  =  1),  the  poor  will  be  in 
power.  Therefore,  we  only  have  to  show  that  the  rich  elite  would  not  prefer  an  inefficient  state 
and  no  tax  evasion.  This  is  straightforward  since  to  prevent  tax  evasion  by  themselves,  the 
rich would have to set a higher tax rate than $\hat{\tau}^E$, since the "no tax evasion constraint" for the
rich under $I_{t-1} = 0$ is

\[ p(\pi(\tau)) \max\left\{ A^H - \frac{q_0}{1 - q_0}\, w,\; 0 \right\} + [1 - p(\pi(\tau))]\, A^H \le (1 - \tau) A^H. \]

At $\hat{\tau}^E$, this constraint is violated (since it is satisfied as equality for $A^L$). Thus, this constraint
will be satisfied at some tax rate $\tau' > \hat{\tau}^E$, which would give a per-period utility of
$(1 - \tau') A^H$ to rich agents, which is strictly less than $p(\pi(\hat{\tau}^E)) \max\{A^H - q_0 w/(1 - q_0), 0\} +
[1 - p(\pi(\hat{\tau}^E))] A^H$. Therefore, the rich are always better off evading taxes when in power.
This establishes that conditions (38) and (39) holding with reverse inequality are sufficient for
the equilibrium with the poor in power to emerge. ■

The most interesting result in Proposition 8 is that, when they are able to capture democratic
politics, the rich do not pay any taxes at all. Instead, they (sometimes) pay bribes
equal to the tax burden on poor agents, $A^L$. This implies that patronage politics turns de jure
proportional  taxation  into  a  de  facto  regressive  one.  In  other  words,  when  the  rich  elite  are 
able  to  set  up  an  inefficient  state  and  receive  the  support  of  bureaucrats,  they  are  not  only  able 
to  limit  redistribution  and  public  good  provision,  but  they  are  also  able  to  shift  most  of  the 
burden  of  taxation  to  the  poor.  Consequently,  the  tax  rate  faced  by  the  poor  may  be  higher 
when  corruption  is  possible  than  in  the  baseline  model  where  both  rich  and  poor  pay  taxes. 
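
One way to see the regressivity is to compare expected payments as a share of income. The expression below is a back-of-the-envelope restatement and not an additional result of the model; it writes $\hat{\tau}^E$ for the equilibrium tax rate given by (37) and uses the inspection probability $p(\pi(\hat{\tau}^E)) = \hat{\tau}^E$ and the bribe $b = A^L$ from Proposition 8.

```latex
\[
\underbrace{\frac{\hat{\tau}^E A^L}{A^L} = \hat{\tau}^E}_{\text{poor: tax as a share of income}}
\qquad \text{versus} \qquad
\underbrace{\frac{p(\pi(\hat{\tau}^E))\, A^L}{A^H}
  = \hat{\tau}^E\, \frac{A^L}{A^H} < \hat{\tau}^E}_{\text{rich: expected bribe as a share of income}}
\]
```

The inequality holds whenever $\hat{\tau}^E > 0$, since $A^H > A^L$. In levels, the expected bribe of a rich agent equals the tax paid by a poor agent, but as a share of income it is smaller, and by assumption bribes do not enter the government budget.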

6     An  Empirical  Implication  and  Some  Evidence 

A  distinctive  empirical  implication  of  our  model  is  that  democracies  where  relative  wages  of 
bureaucrats are high should provide fewer public goods. This is because, all else equal, bureaucrats
are paid higher relative wages when the elite use patronage politics to limit redistribution
and  public  good  provision.  In  contrast,  a  naive  intuition  may  suggest  that  relative  wages  of 
bureaucrats  and  public  good  provision  should  be  correlated  positively,  either  because  when 
there  is  greater  provision  of  public  goods,  more  activities  are  entrusted  to  bureaucrats  and 
they  need  to  be  paid  more,  or  because  countries  with  a  greater  willingness  to  tax  will  spend 
more  both  on  public  employment  and  on  public  good  provision.17 

We next look at the cross-country correlation between the relative wages of bureaucrats
and public good provision among democracies. Our measure of the relative wage of bureaucrats
is the average wage of public-sector employees relative to GDP per capita from the World Bank
for 1991-2000. Our main measure of public good provision is total (central) government
expenditure as a fraction of GDP for 1991-1998, and we also look at social services and welfare
spending as a fraction of GDP as an alternative dependent variable.18 Both of these variables
are from the IMF's International Financial Statistics (see the details in the Appendix). To
focus on democracies, we limit the sample to countries with an average Polity score greater
than or equal to 5 over the period 1991-1998, which corresponds to "stable democracies" (see
Persson and Tabellini, 2003). Our baseline sample contains 51 observations. Figure 1 shows
the correlation between the relative wages of bureaucrats and government expenditure share of
GDP. A strong negative relationship is visible in the figure, with most European countries
having lower relative wages for bureaucrats and higher government expenditures than Latin American
and Asian countries (there are few African countries in our sample).

17 An alternative intuition may be that, with a fixed government budget, higher public sector wages would
force the government to reduce the rest of public good expenditures. In practice, there is considerable variation
in the level of government budgets, and we will see below that the same results apply with a measure of spending
on social services and welfare.

18 We choose the total government expenditure as our main measure, both because we have more observations
on this variable and also because the alternative, social services and welfare spending share of GDP, is heavily
influenced by the age structure of the population. See below.

The regression corresponding to Figure 1 is shown in column 1 of Table 1, with the robust
standard errors in parentheses. The correlation between relative bureaucratic wages and
government expenditure share of GDP is statistically highly significant, with a t-statistic of
approximately 5. Column 2 of the table controls for GDP per capita. Richer countries spend
more on public goods and income per capita is also correlated with relative bureaucratic wages.
This regression shows that log income per capita is indeed significant, but the relationship between
relative bureaucratic wages and government expenditures remains strong (the coefficient
declines from -4.96 to -3.63, which continues to be significant at less than 1%). Column 3 also
controls for the Polity democracy score, which is insignificant and has little effect on the coefficient
of the relative wage of bureaucrats. Figure 2 shows the conditional correlation between
relative  bureaucratic  wage  and  government  expenditure  share  of  GDP  corresponding  to  column 
3  of  Table  1.  The  same  negative  relationship  as  in  Figure  1  is  again  visible. 
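
For reference, the regressions in columns 1-3 can be written in the following generic form. The notation is ours and purely illustrative: $g_i$ is government expenditure as a share of GDP in country $i$, $\omega_i$ is the relative wage of bureaucrats, $y_i$ is log GDP per capita, and $\text{Polity}_i$ is the democracy score. Column 1 corresponds to $\gamma = \delta = 0$ and column 2 to $\delta = 0$.

```latex
\[
g_i = \alpha + \theta\, \omega_i + \gamma\, y_i + \delta\, \text{Polity}_i + u_i
\]
```

The implication of the model is $\theta < 0$; the estimates of $\theta$ reported in the text are $-4.96$ in column 1 and $-3.63$ in column 2.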

Column  4  controls  for  the  age  structure  of  the  population,  in  particular  the  fraction  of 
the  population  between  the  ages  of  15-64  and  the  fraction  over  the  age  of  65.  We  expect 
the  age  structure  of  the  population  to  have  a  direct  effect  on  Social  Security  spending  and 
thus  also  on  total  government  expenditures.  The  results  in  column  4  show  that  controlling  for 
the  age  structure  variables  significantly  reduces  both  the  coefficient  estimate  of  the  relative 
bureaucratic  wage  and  the  standard  errors.  The  coefficient  estimate  is  now  -1.82,  with  a 
standard  error  of  0.85,  which  is  still  significant  at  5%. 
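
The significance statement for column 4 can be verified from the two reported numbers. The short Python check below uses only the coefficient and standard error quoted in the text; the use of a normal approximation for the critical value is an assumption of this sketch.

```python
from statistics import NormalDist

coef, se = -1.82, 0.85   # column 4 estimate and standard error from the text
t_stat = coef / se

# Two-sided 5% critical value under a normal approximation (an assumption here;
# with a sample of roughly 50 countries the exact t critical value is slightly
# larger, close to 2.01).
crit = NormalDist().inv_cdf(0.975)

print(round(t_stat, 2), round(crit, 2), abs(t_stat) > crit)
```

The ratio is a little above 2 in absolute value, which is consistent with significance at the 5% level but not at the 1% level.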

Columns  5-8  repeat  the  same  regressions  using  social  services  and  welfare  spending  as  a 
percentage  of  GDP  as  the  dependent  variable.  The  results  are  similar  to  those  for  government 
expenditure  and  typically  stronger,  except  when  we  control  for  the  age  structure  variables.  In 
particular, in columns 5-7, relative bureaucratic wage is significant at less than 1%. Once we
include the controls for the age structure of the population, however, the relationship between
the relative bureaucratic wage and social services and welfare spending is no longer significant;
the coefficient estimate declines significantly and the standard error doubles. This result might
reflect the fact that social services and welfare spending are closely related to the age structure
of the population and there is little cross-sectional variation left once we control for the age
structure variables.

In addition to the results shown in Table 1, we have also experimented with including
"semi-stable" democracies (those with Polity scores between 0 and 5). The results are similar
but slightly weaker. The results are also similar when we construct the sample using Freedom
House  measures  of  political  and  civil  rights.  We  also  checked  the  robustness  of  the  results  to 
various  other  controls.  The  results  are  broadly  similar  when  we  control  for  the  legal  origin  of 
the  country,  for  parliamentary  versus  presidential  systems,  for  majoritarian  versus  proportional 
democracies,  and  for  the  age  of  democracy.  Nevertheless,  the  results  are  significantly  weakened 
or  disappear  when  we  control  for  a  full  set  of  continent  dummies.  This  is  not  entirely  surprising, 
since,  as  Figures  1  and  2  show,  the  results  reflect  the  contrast  of  European  countries  to  Latin 
American  and  Asian  countries. 

Overall,  it  appears  that  there  is  a  significant  negative  relationship  between  government 
expenditure  and  the  relative  wages  of  bureaucrats,  which  becomes  weaker  when  we  control 
for  the  age  structure  of  the  population  and  for  continent  dummies.  While  this  cross-country 
correlation  is  not  as  robust  as  we  would  like  it  to  be,  it  is  nonetheless  encouraging  for  our 
approach,  since  the  negative  relationship  between  relative  wages  of  bureaucrats  and  government 
expenditure  is  a  counter-intuitive  implication  of  our  model  and  a  naive  intuition  would  have 
suggested  the  opposite  relationship  between  these  two  variables. 

7     Concluding  Remarks 

Inefficiencies  in  the  bureaucratic  organization  of  the  state  are  often  viewed  as  an  important 
factor in retarding economic development. Many sociological accounts of comparative development
emphasize the role of state capacity (or lack thereof) in explaining why some societies
are able to industrialize and modernize (e.g., Evans, 1995, Migdal, 1988). In addition, inefficient
state organizations appear to coincide with limited amounts of public good provision
and  redistribution  towards  the  poor.  Existing  approaches  do  not  address  the  question  of  why 
certain  societies  choose  or  end  up  with  such  inefficient  organizations  and  do  not  clarify  the 
relationship  between  inefficient  state  organizations  and  limited  redistribution. 

We  presented  a  simple  theory  of  the  emergence  and  persistence  of  inefficient  states,  in  which 
the  organization  of  the  public  bureaucracy  is  manipulated  by  the  rich  elite  in  order  to  influence 
redistributive  politics.  In  particular,  by  instituting  an  inefficient  state  structure,  the  elite  are 
able  to  use  patronage  and  capture  democratic  politics.  This  enables  them  to  limit  the  extent  of 
redistribution  and  public  good  provision.  Captured  democracies  not  only  limit  redistribution, 
but  also  create  a  number  of  major  distortions:  the  structure  of  the  state  is  inefficient,  there  is 
too  little  public  good  provision  and  there  may  be  overemployment  of  bureaucrats. 

We  also  showed  that  an  inefficient  state  creates  its  own  constituency  and  tends  to  persist 
over time. Intuitively, an inefficient state structure creates more rents for bureaucrats than
would  an  efficient  state  structure.  When  the  median  (poor)  agent  comes  to  power  in  democracy, 
he  will  reform  the  structure  of  the  state  to  make  it  more  efficient  so  that  the  higher  taxes  can 
be  collected  at  lower  cost  (especially  in  terms  of  lower  rents  for  bureaucrats).  Anticipating 
this,  when  the  organization  of  the  state  is  inefficient,  bureaucrats  support  the  rich,  who  set 
lower  taxes  but  pay  high  wages  to  bureaucrats.  In  order  to  generate  enough  political  support, 
the  coalition  of  the  rich  and  the  bureaucrats  may  not  only  choose  an  inefficient  organization  of 
the  state,  but  they  may  further  expand  the  size  of  bureaucracy  so  as  to  gain  additional  votes. 

The  model  shows  that  an  equilibrium  with  an  inefficient  state  is  more  likely  when  there 
is  greater  income  inequality  and  when  democratic  taxes  are  anticipated  to  be  higher.  An 
interesting  implication  of  this  result  is  that  inequality  and  redistribution  may  be  negatively 
correlated  because  higher  inequality  makes  the  capture  of  democratic  politics  more  likely. 

The  pattern  of  elite  control  in  democracy  based  on  patronage  politics  and  the  emergence  of 
an inefficient state organization bears some resemblance to the inefficient bureaucratic structures
in a number of countries. In addition to these case studies, we provided cross-country
correlations  consistent  with  a  distinctive  implication  of  our  model,  that  among  democracies 
there  should  be  a  negative  relationship  between  the  relative  wages  of  state  employees  and  the 
amount  of  public  good  provision. 

The general message from our analysis is that "not all democracies are created equal"; while
some  democracies  will  adopt  policies  that  redistribute  to  poorer  segments  of  the  society,  others 
may  become  captured  by  traditional  elites.  These  captured  democracies  not  only  choose  low 
levels  of  redistribution,  but,  as  part  of  their  political  rationale  for  survival,  they  also  typically 
create  a  range  of  inefficiencies.  Our  model  suggests  that  these  inefficiencies  might  be  related 
to  the  relatively  poor  performance  of  a  number  of  democracies  in  Latin  America  and  Asia.19 

Analyses  of  the  effect  of  such  policies  on  economic  growth  and  investigations  of  other 
methods  via  which  the  rich  may  limit  the  amount  of  redistribution  in  democratic  politics 
are  interesting  areas  for  future  work.  Another  interesting  area  for  further  study  is  a  more 
careful  empirical  analysis  of  the  relationship  between  the  variation  in  the  extent  of  government 
expenditure,  relative  wages  of  state  employees  and  potential  elite  capture  of  democratic  politics. 


19 Another potential political factor in the poor economic performance of Latin American democracies is
"populism".  Why  some  countries  pursue  populist  policies  is  beyond  the  scope  of  the  current  paper.  Nevertheless, 
it  may  be  conjectured  that  the  political  environment  may  be  more  conducive  to  populism  when  the  majority  of 
the  population  fare  relatively  badly  under  democracy  (see  Acemoglu,  2007)  and  the  type  of  democratic  capture 
studied  in  this  paper  is  likely  to  limit  the  benefits  of  democracy  for  the  majority  of  the  population. 
Appendix A: Omitted Proofs

7.1     Proof  of  Proposition  5 

Consider changes in inequality that keep mean pre-tax income, $Y = (1-n)A^H + nA^L$, constant. This
implies the following simple relationship between the pre-tax incomes of rich and poor agents:

$$A^H = \frac{Y - nA^L}{1-n}. \tag{40}$$
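
As an illustration of the mean-preserving change in inequality described by (40), the following minimal sketch computes $A^H$ for several values of $A^L$ while holding $Y$ and $n$ fixed. The function name and the numerical values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def rich_income(mean_income: float, n: float, poor_income: float) -> float:
    """Equation (40): A^H = (Y - n * A^L) / (1 - n), holding mean income Y fixed."""
    if not 0.0 < n < 1.0:
        raise ValueError("n must be a population share strictly between 0 and 1")
    return (mean_income - n * poor_income) / (1.0 - n)


# Hypothetical parameter values: mean income Y and share of poor agents n.
Y, n = 1.0, 0.75

# Lowering A^L (more inequality) raises A^H while mean income stays at Y.
incomes = {a_l: rich_income(Y, n, a_l) for a_l in (0.8, 0.6, 0.4)}
for a_l, a_h in incomes.items():
    implied_mean = (1.0 - n) * a_h + n * a_l
    print(f"A_L={a_l:.2f}  A_H={a_h:.2f}  implied mean={implied_mean:.2f}")
```

A lower $A^L$ is thus always paired with a higher $A^H$, which is the sense in which inequality increases in the proof.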

To prove the desired result, we need to show that (15) and (21) in Lemmas 8 and 9 are more likely
to hold when there is greater inequality, i.e., when $A^L$ is lower (and $A^H$ is given by (40)).
Let  us  rewrite  condition  (15)  as 


h 
—  > 


(1  _  rD)  AL  +  GD+  l~j^GD 


<7o-i-goLv"        /_~  '     P  (41) 

Next, consider condition (21) in Lemma 9. Suppose first that $\tau^E \geq \hat{\tau}^E$, i.e., when $X = \pi(\tau^E)$ (recall
that more generally $X = \max\{\pi(\tau^E), n - 1/2\}$). Then, combining the government budget constraint
(19) with (40) gives

$$\tau^E = \frac{\pi(\tau^E)}{Y - \pi(\tau^E)A^L}\,\frac{h}{q_0} + \frac{K}{Y - \pi(\tau^E)A^L}. \tag{42}$$
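Substituting for $\tau^E$ from this expression, condition (21) can be rewritten as

$$\frac{h}{q_0} < \left(\tau^D - \frac{G^D}{A^H}\right)\frac{Y - \pi(\tau^E)A^L}{\pi(\tau^E)} - \frac{K}{\pi(\tau^E)} \equiv \bar{\theta}. \tag{43}$$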

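Instead, when $\tau^E < \hat{\tau}^E$, the size of the bureaucracy is $X = n - 1/2 = \lambda$. Solving for $\hat{\tau}^E$ from (20)
and (40) as

$$\hat{\tau}^E = \frac{\lambda h/q_0 + K}{Y - \lambda A^L}, \tag{44}$$

the relevant part of condition (21), $(1-\hat{\tau}^E)A^H > (1-\beta)(1-\tau^E)A^H + \beta[(1-\tau^D)A^H + G^D]$, can
be expressed as

$$\frac{h}{q_0} < \frac{1}{\lambda}\left\{\left[(1-\beta)\tau^E + \beta\left(\tau^D - G^D/A^H\right)\right](Y - \lambda A^L) - K\right\} \equiv \bar{\theta}^*. \tag{45}$$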

These three expressions define $\underline{\theta}$, $\bar{\theta}$ and $\bar{\theta}^*$. Now summarizing our analysis, an inefficient state will
be created under two different scenarios:

1. if $X = \pi(\tau^E)$ and if conditions (41) and (43) are satisfied, which requires

$\underline{\theta} < h/q_0 < \bar{\theta}$;

2. if $X = n - 1/2 = \lambda$ and if conditions (41) and (45) are satisfied, which requires

$\underline{\theta} < h/q_0 < \bar{\theta}^*$.

We will prove that higher inequality makes the inefficient state equilibrium more likely by showing
that the upper thresholds ($\bar{\theta}$ in case 1 and $\bar{\theta}^*$ in case 2) are increasing and the lower threshold, $\underline{\theta}$, is
decreasing in the level of inequality; these naturally imply that the intervals $\Delta = \bar{\theta} - \underline{\theta}$ and $\Delta^* = \bar{\theta}^* - \underline{\theta}$
increase with income inequality. We will also show that an increase in inequality does not cause a switch
from 1 to 2 or vice versa in a way to make the inefficient state less likely.

We  first  establish  an  intermediate  result: 
Claim 1 We have

$$\frac{\partial \tau^D}{\partial A^L} = -\frac{1 + \pi'(\tau^D)}{\pi''(\tau^D)(A^L + h)} < 0, \tag{46}$$

$$\frac{\partial \tau^E}{\partial A^L} = \frac{\tau^E \pi(\tau^E)}{Y - \pi'(\tau^E)(\tau^E A^L + h/q_0) - \pi(\tau^E)A^L} > 0. \tag{47}$$

Proof. The first-order condition of program (9) for an interior $\tau^D$ is

$$\frac{\partial G^D}{\partial \tau} = A^L. \tag{48}$$

Using (40), the equilibrium level of the public good (8) provided by the poor is

$$G^D = \tau^D Y - K - \pi(\tau^D)(A^L + h). \tag{49}$$

The first-order condition (48) therefore becomes

$$Y - \pi'(\tau^D)(A^L + h) - A^L = 0. \tag{50}$$


The solution for $\tau^D$ is always positive since $K > 0$ needs to be financed. Moreover, the assumption
that $\tau^D < 1 - \pi(\tau^D) < 1$ ensures that $\tau^D < 1$. Differentiating (50) gives (46).

Next, differentiating the government budget constraint (19) and using (40) gives (47), where the
denominator is positive since $\tau^E$ is always to the left of the peak of the Laffer curve. ■
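
The comparative static in (46) can be checked numerically. The sketch below assumes a purely hypothetical functional form, $\pi(\tau) = c\tau^2$ (increasing, convex and log-concave on $\tau > 0$), which the paper does not specify; under this assumption the first-order condition (50) has a closed-form solution, and the formula in (46) can be compared with a finite-difference derivative. All parameter values are illustrative.

```python
# Hypothetical functional form, for illustration only: pi(tau) = c * tau**2,
# so pi'(tau) = 2 * c * tau and pi''(tau) = 2 * c.


def tau_d(Y: float, A_L: float, h: float, c: float) -> float:
    """Solve the first-order condition (50): Y - pi'(tau)(A_L + h) - A_L = 0."""
    return (Y - A_L) / (2.0 * c * (A_L + h))


def dtau_d_formula(Y: float, A_L: float, h: float, c: float) -> float:
    """Equation (46): -(1 + pi'(tau_D)) / (pi''(tau_D) * (A_L + h))."""
    t = tau_d(Y, A_L, h, c)
    return -(1.0 + 2.0 * c * t) / (2.0 * c * (A_L + h))


def dtau_d_numeric(Y: float, A_L: float, h: float, c: float, eps: float = 1e-6) -> float:
    """Central finite difference of tau_D with respect to A_L, holding Y fixed."""
    return (tau_d(Y, A_L + eps, h, c) - tau_d(Y, A_L - eps, h, c)) / (2.0 * eps)


# Illustrative parameter values (assumptions, not taken from the paper).
params = dict(Y=1.0, A_L=0.6, h=0.2, c=1.0)
analytic = dtau_d_formula(**params)
numeric = dtau_d_numeric(**params)
print(f"tau_D = {tau_d(**params):.4f}, (46) = {analytic:.4f}, finite difference = {numeric:.4f}")
```

Under these assumptions the two derivatives agree and are negative, so a fall in $A^L$ (higher inequality) raises the democratic tax rate, as the claim states.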

Next, given the definition of $\underline{\theta}$ in (41) we obtain that
where the expression for the limit uses the fact that $dG^D(A^L)/dA^L$ exists, is finite and is independent
of $\beta$, and the inequality again follows from the assumption that $\tau^D < 1 - \pi(\tau^D) < 1$. This inequality
implies that for sufficiently high $\beta$, the inefficient state becomes more attractive to bureaucrats as the
level of inequality increases.

We now show that higher inequality, represented by a decrease in $A^L$ with $A^H$ given by (40), makes
the inefficient state also more attractive to the rich by increasing $\bar{\theta}$ and $\bar{\theta}^*$.

Consider  two  cases: 

Case 1: $X = \pi(\tau^E)$.

From (49), we have

$$\tau^D Y = G^D + K + \pi(\tau^D)(A^L + h). \tag{52}$$

Substituting (52) into (43) and some algebra gives
Next, note that
and from (49), using (50),
Differentiating (53), in turn, gives
which can be rewritten as
Since $\partial\tau^E/\partial A^L > 0$, $\partial\tau^D/\partial A^L < 0$ and $n > \pi(\tau^E)$, all the terms in (55) except the last line,
$\pi(\tau^D)/\pi(\tau^E)$, are negative. Furthermore, we have


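$$\frac{\pi'(\tau^D)}{\pi(\tau^E)}\frac{\partial\tau^D}{\partial A^L}(A^L + h) + \frac{\pi(\tau^D)}{\pi(\tau^E)} = \frac{1}{\pi(\tau^E)}\left[-\pi'(\tau^D)\frac{1 + \pi'(\tau^D)}{\pi''(\tau^D)} + \pi(\tau^D)\right], \tag{56}$$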
where the right-hand side of (56) is obtained using the expression for $\partial\tau^D/\partial A^L$ in (46). Now, the
log-concavity of $\pi(\tau)$ implies that $[\pi'(\tau)]^2 \geq \pi''(\tau)\pi(\tau)$ and is sufficient to ensure that (56) is negative.
This implies that all the terms in (55) including $\pi(\tau^D)/\pi(\tau^E)$ are negative, and therefore $\partial\bar{\theta}/\partial A^L < 0$
as desired. Note that this conclusion holds irrespective of the value of $\beta$.

Case 2: $X = n - 1/2 = \lambda$.

The proof parallels that of case 1. From (45), using (52) and the fact that $Y - \lambda A^L = (1-n)A^H +
(1/2)A^L$, we obtain
Differentiating (57) and using (54), (58) and the fact that $\partial\tau^E/\partial A^L$ exists and is independent of $\beta$,
we  have 

/3-i  8AL  A  1     v      ;  v      '  8AL  v  '  dAL  dAL  v      ' 
which can be rewritten as
Since $\partial\tau^D/\partial A^L < 0$, $\pi'(\tau^D) > 0$, $n > 1/2$ and $A^L < A^H$, all terms in (59) other than $\pi(\tau^D)$ are
negative. Therefore, a sufficient condition to ensure that (59) is negative is

$$\pi'(\tau^D)\frac{\partial\tau^D}{\partial A^L}(A^L + h) + \pi(\tau^D) = -\pi'(\tau^D)\frac{1 + \pi'(\tau^D)}{\pi''(\tau^D)} + \pi(\tau^D) < 0, \tag{60}$$

where the right-hand side of (60) is obtained using the expression for $\partial\tau^D/\partial A^L$ in (46). This condition is
equivalent to (56) and the log-concavity of $\pi(\tau)$ is sufficient to ensure it. This establishes that $\partial\bar{\theta}^*/\partial A^L$
is negative for sufficiently high $\beta$ as desired.

The proof so far has established that the lower threshold $\underline{\theta}$ declines as inequality increases and that
the upper thresholds $\bar{\theta}$ and $\bar{\theta}^*$ increase as inequality increases. To complete the proof of the proposition,
we need to ensure that there would be no switch from the wider to the smaller interval, which could
happen if we have a switch from $\tau^E \geq \hat{\tau}^E$ to $\tau^E < \hat{\tau}^E$, or vice versa. However, as $\beta \to 1$, we have that
(21) is equivalent to

if $\tau^E \geq \hat{\tau}^E$ & $X = \max\{\pi(\tau^E), n - 1/2\} = \pi(\tau^E)$ and $(\tau^D - \tau^E)A^H > G^D$

if $\tau^E < \hat{\tau}^E$ & $X = \max\{\pi(\tau^E), n - 1/2\} = n - 1/2 = \lambda$ and $(\tau^D - \hat{\tau}^E)A^H > G^D$

so that for $\beta$ sufficiently large, at the point of a possible switch, $\tau^E = \hat{\tau}^E$, we have $\bar{\theta} = \bar{\theta}^*$. This
completes the proof. ■

7.2     Proof  of  Lemma  16 

Suppose that the bureaucratic wage is given by $w^E$ in (33). Then the incentive compatibility constraint
(28) of the bureaucrats inspecting low-skill producers is satisfied even when the producers offer a bribe
as large as their income $A^L$. Holding $\tau$ fixed, a decrease in $w$ from $w^E$ will allow bureaucrats to accept
bribes, and thus reduce government revenues to zero. Therefore, it cannot be optimal. Increasing $w$ is
also not beneficial for the rich.

Condition (31), on the other hand, ensures that the poor choose to pay taxes. Holding $w$ fixed at
$w^E$, increasing taxes would induce the poor not to pay taxes and is therefore not beneficial. Reducing
taxes is also not beneficial. Given (33) and (34), it is also straightforward to verify that the utility of
the rich is decreasing in $G$, so that this variable is set equal to zero.

This argument shows that (33) and (34) give a stationary point of the optimization problem of the
rich, since the rich will not find it beneficial to change either one of $x$ or $w$ by itself. To complete the
proof, we need to show that it is also not beneficial to change $x$ and $w$ simultaneously. We will do this
by showing that the payoff function of the rich is strictly quasi-concave. Consider the problem of the
rich if they do not pay taxes. Clearly, constraints (31) and (32) in equilibrium hold as equalities. We
can thus solve out for the tax rate from the government budget constraint as


$$\tau = \frac{xw + K + G}{(n - x)A^L}.$$

Substituting this expression into (31), we obtain

$$w = \frac{K + G}{[q_0/(1-q_0)](n - x)p(x) - x}, \tag{61}$$

provided that $w > (1 - q_0)A^L/q_0$ (which will be true in equilibrium). Now, substituting (61) in the
objective function of the rich (30), and observing that this is maximized at $G = 0$, we can represent the
problem of the rich as the following single dimensional maximization problem

maxt/  (x)  =  AH  - 


(n-z)  -  (l-q0)x/q0p(x) 


If  this  problem  is  strictly  quasi-concave,  it  must  have  a  unique  solution.  Corresponding  to  this 
unique  x,  there  will  be  unique  levels  of  r  and  w,  since  these  variables  are  defined  uniquely  by  the 
previous  equalities.  To  check  that  this  function  is  indeed  strictly  quasi-concave,  note  that 

,,,/   ,,  1  +  (1  -  go)  /q0p  (x)  -  (1  -  go)  \p'  jx)  x]  /q0  \p  jx)}2 

U    (X)  = 2 -"■• 

[(n-x)  -  (1  -  qo)x/q0p(x)} 

For $U(x)$ to be strictly quasi-concave, it is sufficient that its second derivative is negative when
$U'(x) = 0$. For this, it is sufficient for $-\{[p(x)]^2 + (1 - q_0)p(x)/q_0 - (1 - q_0)[p'(x)x]/q_0\}$ to be
strictly decreasing in $x$. It can be easily verified that this is always the case since $p(x)$ is increasing and
concave in $x$. This completes the proof that (33) and (34) are optimal for the rich when they prefer not
to pay taxes. ■
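
The monotonicity property used in the last step can be illustrated numerically. The sketch below assumes a hypothetical increasing and concave function, $p(x) = \sqrt{x}$, and a hypothetical value of $q_0$; neither is taken from the paper. It evaluates the expression $-\{[p(x)]^2 + (1-q_0)p(x)/q_0 - (1-q_0)p'(x)x/q_0\}$ on a grid so that one can check it is strictly decreasing in $x$.

```python
import math


def p(x: float) -> float:
    """Hypothetical increasing, concave function standing in for p(x)."""
    return math.sqrt(x)


def p_prime(x: float) -> float:
    """Derivative of the hypothetical p(x) = sqrt(x)."""
    return 0.5 / math.sqrt(x)


def sufficiency_term(x: float, q0: float) -> float:
    """-( p(x)^2 + (1 - q0) p(x) / q0 - (1 - q0) p'(x) x / q0 )."""
    return -(p(x) ** 2 + (1.0 - q0) * p(x) / q0 - (1.0 - q0) * p_prime(x) * x / q0)


# Illustrative grid for x and an assumed value of q0.
q0 = 0.5
grid = [0.05 * k for k in range(1, 15)]
values = [sufficiency_term(x, q0) for x in grid]
is_decreasing = all(a > b for a, b in zip(values, values[1:]))
print("strictly decreasing on the grid:", is_decreasing)
```

This checks only one functional form on one grid; the argument in the text covers any increasing and concave $p(x)$.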
Appendix B: Data Sources and Definitions

Our  dataset  builds  on  the  cross-country  dataset  compiled  by  Persson  and  Tabellini  (2003)  (henceforth 
PT).  Our  sample  of  "stable  democracies"  consists  of  countries  with  an  average  Polity  score  greater  than 
or  equal  to  5  over  the  period  1991-1998.  We  also  report  results  with  a  sample  containing  all  democracies 
(defined  as  countries  with  average  Polity  score  greater  than  0  over  the  period  1991-1998).  With  the 
exception  of  the  average  government  wage  to  per  capita  GDP  (which  comes  from  the  World  Bank),  all 
our  variables  are  from  PT's  dataset.  These  variables  are  the  following: 

Central  government  expenditures  as  a  percentage  of  GDP:  constructed  using  the  item 
Government  Finance-Expenditures  in  the  IFS,  divided  by  GDP  at  current  prices  and  multiplied  by  100. 
Source:  IMF-IFS  CD-Rom  and  IMF-IFS  Yearbook. 

Consolidated central government expenditures on social services and welfare as percentage
of GDP: from the GFS Yearbook, divided by GDP and multiplied by 100. Source: IMF-GFS
Yearbook  2000  and  IMF-IFS  CD-Rom. 

Log  GDP  per  capita:  per  capita  real  GDP  defined  as  real  GDP  per  capita  in  constant  dollars 
(chain  index)  expressed  in  international  prices,  base  year  1985.  Data  through  1992  are  taken  from  the 
Penn  World  Table  5.6,  while  data  on  the  period  1993-98  are  computed  from  data  taken  from  the  World 
Development  Indicators,  the  World  Bank.  These  later  observations  are  computed  on  the  basis  of  the 
latest observation available from the Penn World Tables and the growth rates of GDP per capita in the
subsequent  years  computed  from  the  series  of  GDP  at  market  prices  (in  constant  1995  U.S.  dollars)  and 
population,  from  the  World  Development  Indicators. 

Sources:    Penn  World  Tables  -  mark  5.6  (PWT),  available  on 

http://datacentre2.chass.utoronto.ca/pwt/docs/topic.html. 

The  World  Bank's  World  Development  Indicators,  available  on 

http://www.worldbank.org. 
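
The splicing procedure described for the 1993-98 observations can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the argument layout (series aligned so that the first element corresponds to the year of the last available Penn World Table observation) and the numbers are hypothetical, and the sketch is not the code used to build the dataset.

```python
import math
from typing import List, Sequence


def extend_gdp_per_capita(
    last_pwt_value: float,
    wdi_gdp: Sequence[float],
    wdi_population: Sequence[float],
) -> List[float]:
    """Extend a per capita GDP series beyond its last PWT observation.

    wdi_gdp and wdi_population are assumed to start in the year of
    last_pwt_value; growth rates of their ratio are applied cumulatively.
    """
    if len(wdi_gdp) != len(wdi_population):
        raise ValueError("GDP and population series must have the same length")
    per_capita = [g / pop for g, pop in zip(wdi_gdp, wdi_population)]
    extended = [last_pwt_value]
    for previous, current in zip(per_capita, per_capita[1:]):
        extended.append(extended[-1] * current / previous)
    return extended


# Hypothetical numbers: GDP grows 10 percent a year with constant population.
levels = extend_gdp_per_capita(100.0, [200.0, 220.0, 242.0], [10.0, 10.0, 10.0])
log_levels = [math.log(v) for v in levels]
print([round(v, 2) for v in levels])
```

Only growth rates are taken from the second source, so the extended series keeps the units of the Penn World Table observation.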

Polity:  score  for  democracy,  computed  by  subtracting  the  AUTOC  score  from  the  DEMOC  score, 
and  ranging  from  +10  (strongly  democratic)  to  -10  (strongly  autocratic).  AUTOC  (DEMOC)  is 
the index of autocracy (democracy), derived from codings of the competitiveness of political participation,
the regulation of participation, the openness and competitiveness of executive recruitment, and
constraints  on  the  chief  executive. 

Source:  Polity  IV  Project  (http://www.cidcm.umd.edu/inscr/polity/index.htm). 
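
The Polity score and the two sample definitions used in this appendix can be expressed compactly. The sketch below is illustrative only: the function names and the yearly scores are hypothetical, while the thresholds (average score of at least 5 for "stable democracies", greater than 0 for all democracies) follow the definitions given above.

```python
from statistics import mean
from typing import Dict, Sequence


def polity_score(democ: int, autoc: int) -> int:
    """Polity score: DEMOC minus AUTOC, ranging from -10 to +10."""
    return democ - autoc


def classify_sample(yearly_scores: Sequence[float]) -> Dict[str, bool]:
    """Classify a country by its average Polity score over the sample period."""
    average = mean(yearly_scores)
    return {
        "stable_democracy": average >= 5,  # average score greater than or equal to 5
        "democracy": average > 0,          # average score greater than 0
    }


# Hypothetical yearly scores for 1991-1998 (eight observations each).
country_a = classify_sample([6, 6, 4, 5, 5, 5, 5, 4])   # average is exactly 5
country_b = classify_sample([2, 2, 1, 1, 0, 0, 1, 1])   # average is 1
print(country_a, country_b)
```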

Age  structure  variables:  percentage  of  population  between  the  ages  of  15  and  64  in  the  total 
population  and  percentage  of  population  over  the  age  of  65  in  the  total  population. 

Source:  World  Development  Indicators  CD-Rom  1999. 

Average  government  wage  relative  to  per  capita  GDP:  mean  value  of  the  average  government 
wage  to  per  capita  GDP  between  1991  and  2000.  It  is  computed  as  the  average  of  the  two  data  points 
available  for  the  periods  1991-95  and  1996-2000.  When  data  for  one  of  the  two  periods  are  not  available, 
only  the  available  time  period  is  used.  The  variable  is  calculated  by  dividing  the  average  government 
wage  by  the  GDP  per  capita  figure.  The  average  government  wage  is  calculated  as  the  total  central 
government  wage  bill  divided  by  the  number  of  employees  in  total  central  government.  The  total  central 
government  wage  bill  is  the  sum  of  wages  and  salaries  paid  to  central  government  employees,  including 
armed  forces  personnel.  The  number  of  employees  in  total  central  government  is  the  sum  of  total  civilian 
central  government  and  the  Armed  Forces. 

Source:  Schiavo-Campo,  de  Tommaso  and  Mukherjee  (1997),  Table  A-3, 

http://www-wds.worldbank.org/external/default. 
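
The construction of the relative wage variable described above can be sketched as follows. The function names and the numbers are hypothetical; the sketch assumes that the wage bill and GDP per capita are expressed in the same currency units, and it uses `None` to mark a period without data.

```python
from typing import Optional


def relative_wage(wage_bill: float, employees: float, gdp_per_capita: float) -> float:
    """Average government wage (wage bill / employees) divided by GDP per capita."""
    average_wage = wage_bill / employees
    return average_wage / gdp_per_capita


def mean_relative_wage(
    period_1991_95: Optional[float], period_1996_2000: Optional[float]
) -> Optional[float]:
    """Average the two period values; use the available one if the other is missing."""
    available = [v for v in (period_1991_95, period_1996_2000) if v is not None]
    if not available:
        return None
    return sum(available) / len(available)


# Hypothetical figures for a single country.
early = relative_wage(wage_bill=1200.0, employees=100.0, gdp_per_capita=4.0)
both_periods = mean_relative_wage(early, 2.0)
one_period = mean_relative_wage(None, 2.0)
print(early, both_periods, one_period)
```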
Redistribution and Public Sector Relative Wage: Unconditional Relationship
Figure 1. The figure reports the fitted values of the unconditional relationship between average government
wage to per capita GDP and central government expenditures as percentage of GDP for the sample of
countries with an average polity index for the period 1991-98 greater than or equal to 5 (see text for details).
Redistribution and Public Sector Relative Wage: Conditional Relationship
Figure 2. The figure reports the fitted values of the conditional relationship between average government
wage to per capita GDP and government expenditures as percentage of GDP for the sample of countries with
an average polity index for the period 1991-98 greater than or equal to 5. The control variables are the log of
per capita real GDP and the average polity index (see text for details).